Abstract We start by reviewing recent probabilistic results on ergodic sums in a
large class of (non-uniformly) hyperbolic dynamical systems. Namely, we describe
the central limit theorem, the almost-sure convergence to the Gaussian and other
stable laws, and large deviations.

Next, we describe a new branch in the study of probabilistic properties of dynamical
systems, namely concentration inequalities. They make it possible to describe the
fluctuations of very general observables and to get bounds rather than limit laws.

We end with two sections: one gathering various open problems, notably on
random dynamical systems, coupled map lattices and so-called nonconventional
ergodic averages; and another one giving pointers to the literature about moderate
deviations, the almost-sure invariance principle, etc.
1 Introduction

The aim of the present chapter is to roughly describe the current state of the theory 
of statistical or probabilistic properties of 'chaotic' dynamical systems. We shall 
restrict ourselves to discrete-time dynamical systems, although many of the results 
we review have their counterparts for flows. The basic setting is thus a state space $\Omega$
(typically a piece of $\mathbb{R}^d$) and a map $T : \Omega \to \Omega$. The orbit of an initial condition $x_0$ is
the sequence of points $x_0, x_1 = Tx_0, x_2 = Tx_1, \ldots$ or $\{T^k x_0; k = 0, 1, \ldots\}$ (where $T^k$
is the $k$-fold composition of $T$ with itself).

The core of the probabilistic approach is the description of asymptotic
time-averages of 'observables', that is, functions $f : \Omega \to \mathbb{R}$. This implies that
transients become irrelevant, although transient effects may cause formidable problems
in practice. The cornerstone of this approach is Birkhoff's ergodic theorem. It tells
us that, given a measure $\mu$ left invariant by $T$, 'the asymptotic time-average of $f$
coincides with the space-average $\int f\,d\mu$', except on a set of measure zero with respect
to this measure. The drawback of this result is that chaotic systems typically possess
uncountably many invariant ergodic measures. Is there a 'natural' choice?

In this chapter, we focus on dissipative systems whose orbits settle on an at- 
tractor which has typically a volume (Lebesgue measure) equal to zero. In these 
systems, the dynamics contracts volumes but generally not in all directions: some 
directions may be stretched, provided some others are so much contracted that the 
final volume is smaller than the initial volume. This implies that, even in a dissipa- 
tive system, the motion after transients may be unstable within the attractor. This 
instability manifests itself by an exponential separation of orbits, as time goes on, 
of points which initially are very close to each other on the attractor. The exponen- 
tial separation takes place in the direction of stretching. Such an attractor is called 
chaotic. Of course, since the attractor is bounded, exponential separation can only 
hold as long as distances are small. 

A famous attractor is the Hénon attractor generated by a two-dimensional map
with two parameters. For some parameters, it is easy to numerically produce a
'picture' of the attractor. The standard way to make it is to pick 'at random' an initial
condition in the basin of the attractor and to plot the first thousand iterates of its
orbit (see Fig. 3). On the one hand, why does what is observed have anything to do
with the attractor since, as noticed above, it has zero volume? On the other hand, we
know that orbits of the Hénon map are not all the same: some are periodic, others
are not; some come closer to the 'turns' than others. We also know from experience
that (for a fixed $T$) one gets essentially the same picture independently of the choice
of initial condition. Is there a mathematical explanation for this?
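
The numerical experiment just described can be sketched as follows. This is only an illustration under assumptions: we take the Hénon map in its usual form $(x, y) \mapsto (1 - ax^2 + y, bx)$ with the 'historical' values $a = 1.4$, $b = 0.3$ (which, as noted in the caption of Fig. 3, are not covered by the existing theory), and the helper names `henon_orbit` and `time_average` are ours. The sketch compares the time average of the first coordinate along two orbits started from different initial conditions.

```python
def henon_orbit(x, y, n, a=1.4, b=0.3, burn_in=1000):
    """Return n iterates of (x, y) -> (1 - a*x^2 + y, b*x), after discarding transients."""
    for _ in range(burn_in):
        x, y = 1.0 - a * x * x + y, b * x
    orbit = []
    for _ in range(n):
        x, y = 1.0 - a * x * x + y, b * x
        orbit.append((x, y))
    return orbit


def time_average(orbit, f):
    """Average of the observable f along a finite piece of orbit."""
    return sum(f(x, y) for x, y in orbit) / len(orbit)


# Two different initial conditions, both assumed to lie in the basin of the attractor.
orbit_a = henon_orbit(0.0, 0.0, 20000)
orbit_b = henon_orbit(0.1, 0.1, 20000)

avg_a = time_average(orbit_a, lambda x, y: x)
avg_b = time_average(orbit_b, lambda x, y: x)
print(avg_a, avg_b)
```

Plotting the points of `orbit_a` and `orbit_b` should give pictures that are hard to tell apart, and the two time averages should be close. This is a numerical observation, not a proof.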

These questions motivated the idea of Sinai-Ruelle-Bowen or SRB measures. Our
computer picture can be thought of as the picture of a probability measure giving mass
$1/n$ to each point in an orbit of length $n$. Let $\delta_x$ be the point mass at $x$. Is there a
(probability) measure $\mu$ with the property that $\frac{1}{n}\sum_{k=0}^{n-1}\delta_{T^k x} \to \mu$ for 'most' choices
of initial conditions $x$, that would explain why our pictures look similar? If such
a measure does exist, it has very special properties: like all invariant probability
measures, it must be supported on the attractor, but it has the peculiar ability to
influence orbits starting from various parts of the basin, including points rather far
away from the support of $\mu$. In some sense, SRB measures are the observable or
physical measures.

Mathematically speaking, the theory of chaotic attractors began with the ergodic 
theory of differentiable dynamical systems, more specifically the theory of hyper- 
bolic dynamical systems, where geometry plays a prominent role. The first systems 
studied in the 1960s and 1970s were the so-called Anosov and Axiom A systems, which
are 'uniformly' hyperbolic and in some sense the most chaotic systems. The main
results were obtained by Sinai, Ruelle and Bowen. They essentially relied on the
fact that, for such systems, it is possible to construct Markov partitions enabling one
to identify points in the state space with configurations in one-dimensional lattice
systems of statistical mechanics [9].

The 1970s brought new outlooks and new challenges. With the aid of computer
graphics, an abundance of examples showed up whose dynamics are dominated by
expansions and contractions, but which do not satisfy the stringent requirements of
Axiom A systems. Hénon's attractor mentioned above is a typical example. This led
to a more comprehensive theory dealing with non-uniformly hyperbolic dynamical
systems developed abstractly by Pesin and others [40, Chap. 2]. A breakthrough
was made by L.-S. Young at the end of the 1990s [56, 57]. She proposed a more
'phenomenological' approach to describe in a unified framework many examples of
systems with a 'localized' source of non-hyperbolicity. In particular, this provided
tools to prove the existence of an SRB measure for the Hénon attractor (for a set
of parameters with positive measure), see 0. In this chapter, we shall focus on the 
class of systems defined by Young. 

Once we know that our dynamical system $(\Omega, T)$ admits an SRB measure, we can
ask for its probabilistic properties. Indeed, it can be viewed as a stationary stochastic
process: the orbits $(x, Tx, \ldots)$, where $x$ is distributed according to $\mu$, generate a
stationary process whose finite-dimensional marginals are the measures $\mu_n$ on $\Omega^n$
given by

\[ d\mu_n(x_0, \ldots, x_{n-1}) = d\mu(x_0)\,\delta_{x_1 = Tx_0} \cdots \delta_{x_{n-1} = Tx_{n-2}}. \]

This is not a product measure but the idea is that, if the system is chaotic enough, $T^k x$
is more or less independent of $x$ provided $k$ is large, making the process $(x, Tx, \ldots)$
behave like an independent process.

Given any observable $f : \Omega \to \mathbb{R}$, one can generate a process $\{X_n = f \circ T^n; n \ge 0\}$
on the probability space $(\Omega, \mu)$. The ergodic sum
$S_n f(x) = f(x) + f(Tx) + \cdots + f(T^{n-1}x)$ is thus the partial sum of the process
$\{X_n; n \ge 0\}$ and one can ask various natural questions. For instance, what is the
typical size of fluctuations of $\frac{1}{n} S_n f(x)$ around $\int f\,d\mu$? What is the probability
that $\frac{1}{n} S_n f(x)$ deviates from $\int f\,d\mu$ by more than some prescribed value? Does
$S_n f$, appropriately renormalized, converge in law? In other words, can we prove a
central limit theorem? Can we get a description of large deviations? Can we have
Gaussian but also non-Gaussian limit laws? Results of this kind are called limit theorems.

There are many quantities describing a dynamical system which can in principle
be computed by observing its orbits. But the corresponding estimators are not
as simple as ergodic sums of suitably chosen observables. A prominent example
(see below for details) is the periodogram, which is related to the power spectrum.
Therefore it is desirable to have a tool which makes it possible to quantify fluctuations of
fairly general observables for finite-length orbits. This is the scope of concentration
inequalities, a new branch in the study of probabilistic properties of dynamical
systems (and a quite recent branch of probability theory as well [49]). The aim of
concentration inequalities is to quantify the size of the deviations of an observable
$K(x, Tx, \ldots, T^{n-1}x)$ around its expectation, where $K : \Omega^n \to \mathbb{R}$ is an observable of
$n$ variables of an arbitrary expression. An ergodic sum is a very special case of such
an observable and we shall see below various examples. What is imposed on $K$ is
sufficient smoothness (Lipschitz property). Depending on the 'degree of chaos' in
the system, the deviations of $K$ with respect to its expectation can have an extremely
small probability.

From the technical viewpoint, the tool of paramount importance is the transfer
operator, also called the Ruelle-Perron-Frobenius operator. This is the spectral approach
to dynamical systems. We refer to the book of Baladi [3] and to the lecture notes of
Hennion and Hervé [41] for a thorough exposition.

Our purpose is to give a sample of recent results on the fluctuations of observables 
in the ergodic theory of non-uniformly hyperbolic dynamical systems. Needless to
say, the overwhelming number of works in this area renders futile any attempt at
an exhaustive or even comprehensive treatment within the confines of this chapter.
Hopefully, this chapter provides a panoramic view of this subject. We also provide 
a list of directions for further research. 

Before describing the contents of this chapter, a few words are in order about
the bibliography. We urge the reader to consult [42], in which are gathered landmark
papers illustrating the history and development of the notions of chaotic attractors
and their 'natural' invariant measures. For numerical implementations of the theory,
it is still worth reading the review paper by Eckmann and Ruelle [30]. A more recent
reference, dealing both with theoretical and numerical aspects, is the book by Collet
and Eckmann [24]. Needless to say, the potential list of references is gigantic.
Limitations of space and time painfully forced us to exclude many relevant papers.
As a matter of principle, and whenever possible, we refer to the most recent articles
which contain relevant pointers to the literature. We apologize for omissions.
Layout of the chapter. In Section 2 we describe the probabilistic approach to
dynamical systems and recall Birkhoff's ergodic theorem. In Section 3 we describe the
class of hyperbolic dynamical systems we will be working with. In particular, we
quickly describe Young towers and SRB measures, and give several examples which
will be used throughout the chapter. Section 4 is devoted to mixing (decay of
correlations) and limit theorems, namely: the central limit theorem, convergence to
non-Gaussian laws, exponential and sub-exponential large deviations, and convergence
in law made almost sure. Section 5 is concerned with concentration inequalities and
some of their applications. In Section 6 we provide a list of open problems and
questions related to random dynamical systems, coupled map lattices, partially
hyperbolic systems, and the Erdős-Rényi law. We end with a section where we quickly
survey results not detailed in the main text. This includes the Berry-Esseen theorem,
moderate deviations and the almost-sure invariance principle.



2 Generalities 

We state some general definitions and recall Birkhoff's ergodic theorem.



2.1 Dynamical systems and observables

In this chapter, by 'dynamical system' we mean a deterministic dynamical system
with discrete time, that is, a transformation $T : \Omega \to \Omega$ of its state space (or phase
space) $\Omega$ into itself. For the sake of concreteness, one can think of $\Omega$ as a compact
subset of $\mathbb{R}^d$. Mathematically speaking, one can deal with a compact Riemannian
manifold.

Every point $x \in \Omega$ represents a possible state of the system. If the system is in
state $x$, then it will be in state $T(x)$ at the next moment of time. Given the current
state $x = x_0 \in \Omega$, the sequence of states

\[ x_1 = Tx_0,\ x_2 = Tx_1,\ \ldots,\ x_n = Tx_{n-1},\ \ldots \]

represents the entire future or forward orbit of $x_0$. We have $x_n = T^n x_0$, where $T^n$ is
the $n$-fold composition of $T$ with itself. If the map $T$ is invertible, then the past of
$x_0$ can be determined as well ($x_{-n} = T^{-n} x_0$).

In applications, the actual states $x_n \in \Omega$ are often not observable. Instead, we
usually observe the values $f(x_n)$ taken by a function $f$ on $\Omega$, usually called an
observable. It can be thought of as an instantaneous measurement of the system.
For the sake of simplicity, we consider $f$ to be real-valued.

More generally, we may observe the system from time $0$ up to time $n - 1$ and
associate to $x, Tx, \ldots, T^{n-1}x$ a real number $K(x, Tx, \ldots, T^{n-1}x)$. In the language of
statistics, $K : \Omega^n \to \mathbb{R}$ is called an estimator. The fundamental example is the Cesàro
or ergodic average of an 'instantaneous' observable $f : \Omega \to \mathbb{R}$ along an orbit up
to time $n - 1$:

\[ K_0(x, Tx, \ldots, T^{n-1}x) := \frac{f(x) + f(Tx) + \cdots + f(T^{n-1}x)}{n}. \]

This is an example of an additive observable. There are many natural examples which are
not as simple. An important example is the periodogram used to estimate the power
spectrum of a 'signal' $\{f(x_k); k = 0, \ldots, n-1\}$. We give its definition below, together
with other examples.
2.2 Dynamical systems as stochastic processes



Ergodic theory is concerned with measure-preserving transformations, meaning
that the map $T$ preserves a probability measure $\mu$ on $\Omega$: for any measurable
subset $A \subset \Omega$ one has $\mu(A) = \mu(T^{-1}(A))$, where $T^{-1}(A)$ denotes the set of points
mapped into $A$. The invariant measure $\mu$ describes the distribution of the sequence
$\{x_n = T^n(x_0)\}$ for typical initial states $x_0$. This vague statement is made precise
by Birkhoff's ergodic theorem; see below. For a large class of non-uniformly
hyperbolic systems, there is a 'natural' invariant measure, the so-called
Sinai-Ruelle-Bowen measure (SRB measure for short).

A measure-preserving dynamical system is thus a probability space $(\Omega, \mathcal{B}, \mu)$
endowed with a transformation $T : \Omega \to \Omega$ leaving $\mu$ invariant. An important notion
is that of an ergodic dynamical system. The invariant measure $\mu$ is said to be
ergodic (with respect to $T$) whenever $T^{-1}(E) = E$ implies $\mu(E) = 0$ or $\mu(E) = 1$.
Equivalently, ergodicity means that any invariant function $g : \Omega \to \mathbb{R}$ is $\mu$-almost
everywhere constant. That $g$ be invariant means that $g = g \circ T$. In the
measure-theoretic sense, ergodic measures are indecomposable and any invariant measure
can be disintegrated into its ergodic components [14].

A measure-preserving dynamical system can be viewed as a stochastic process:
the orbits $(x, Tx, \ldots)$, where $x$ is distributed according to $\mu$, generate a stationary
process whose finite-dimensional marginals are the measures $\mu_n$ on $\Omega^n$ given by

\[ d\mu_n(x_0, \ldots, x_{n-1}) = d\mu(x_0)\,\delta_{x_1 = Tx_0} \cdots \delta_{x_{n-1} = Tx_{n-2}}. \]

This is not a product measure but the idea is that, if the system is chaotic enough, $T^k x$
is more or less independent of $x$ provided $k$ is large, making the process $(x, Tx, \ldots)$
behave like an independent process.

Given an observable $f : \Omega \to \mathbb{R}$, $X_k = f \circ T^k$, for each $k \ge 0$, is a random variable
on the probability space $(\Omega, \mathcal{B}, \mu)$. The family $\{X_n; n \ge 0\}$ is a real-valued
stationary process. The ergodic sum $S_n f(x) = f(x) + f(Tx) + \cdots + f(T^{n-1}x)$ is thus the
partial sum of the process $\{X_n; n \ge 0\}$.

We shall make no attempt to define precisely what a chaotic dynamical system is.
From the point of view of this chapter, we can vaguely state that it is a system such
that, for sufficiently nice observables $f$, the process $\{f \circ T^k\}$ behaves like an i.i.d.¹
process. Along the way, this crude statement will be refined.



2.3 Birkhoff's Ergodic Theorem

The fundamental theorem in ergodic theory is Birkhoff's ergodic theorem, which is
a far-reaching generalization of Kolmogorov's strong law of large numbers for an
independent process [46].



1 i.i.d. stands for 'independent and identically distributed'.
Theorem 1 (Birkhoff's ergodic theorem).

Let $(\Omega, \mathcal{B}, \mu)$ be a dynamical system and $f : \Omega \to \mathbb{R}$ be an integrable
observable ($\int |f|\,d\mu < \infty$). Then

\[ \lim_{n\to\infty} \frac{1}{n} S_n f(x) = f^*(x), \quad \mu\text{-almost surely and in } L^1(\mu), \]

where the function $f^*$ is invariant ($f^* = f^* \circ T$, $\mu$-a.s.) and such that

\[ \int f^*\,d\mu = \int f\,d\mu. \]

If the dynamical system is ergodic, then $f^*$ is $\mu$-almost surely a constant,
whence

\[ \lim_{n\to\infty} \frac{1}{n} S_n f(x) = \int f\,d\mu, \quad \mu\text{-almost surely}. \]



Remark 1. The previous theorem, spelled out for an integrable stationary ergodic
process $\{X_n\}$, reads $n^{-1}\sum_{j=0}^{n-1} X_j \to \mathbb{E}[X_0]$ almost surely. In the non-ergodic case
convergence is to the conditional expectation of $X_0$ with respect to the $\sigma$-algebra of
invariant sets, see [46] for details.

Very often, $\Omega$ is compact and it is not difficult to show that, in the ergodic case,
there exists a measurable set of $\mu$-measure one such that, for every $x$ in this set,

\[ \lim_{n\to\infty} \frac{1}{n} S_n g(x) = \int g\,d\mu \]

for any continuous observable $g : \Omega \to \mathbb{R}$. Equivalently, this means that the empirical
measure of $\mu$-almost every $x$ converges towards $\mu$ in the vague (or weak-*) topology:

\[ \frac{1}{n} \sum_{j=0}^{n-1} \delta_{T^j x} \to \mu \quad \text{almost surely}. \]

The advantage of Birkhoff's ergodic theorem is its generality. Its drawback is that a
chaotic system has in general uncountably many distinct ergodic measures. Which
one do we choose? We shall see later on that the idea of a Sinai-Ruelle-Bowen
measure provides an answer.
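
As a hedged numerical illustration of the ergodic theorem, the sketch below computes ergodic averages for the quadratic map $T(x) = 1 - 2x^2$ on $[-1, 1]$. We assume the classical fact that, for this particular parameter, the measure with density $1/(\pi\sqrt{1 - x^2})$ is invariant and ergodic, so that the space average of $f(x) = x^2$ equals $1/2$. The function name `birkhoff_average` and the initial conditions are ours, and floating-point orbits are only pseudo-orbits, so this is a plausibility check rather than a proof.

```python
def birkhoff_average(f, x0, n, a=2.0):
    """Ergodic average (1/n) * sum_{k<n} f(T^k x0) for T(x) = 1 - a*x^2."""
    x, total = x0, 0.0
    for _ in range(n):
        total += f(x)
        x = 1.0 - a * x * x
    return total / n


# Three arbitrary initial conditions; the expected limit is 1/2 in each case.
estimates = [birkhoff_average(lambda x: x * x, x0, 100_000)
             for x0 in (0.3, -0.77, 0.123456)]
print(estimates)
```

The three estimates should all be close to $1/2$, whatever the (typical) initial condition. The theorem itself says nothing about how large $n$ must be for this to happen.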



2.4 Speed of convergence and fluctuations 

It is well known that not much can be said about the speed of convergence of the
ergodic average to its limit in Theorem 1. First of all, one cannot know in practice
if we are observing a typical orbit for which convergence indeed occurs. But even if
we knew that we have a typical orbit, it can be shown that the convergence can be
arbitrarily slow (see for instance [43] for a survey).

To obtain more information about the fluctuations of ergodic sums around their
limit, we need a probabilistic formulation. Maybe the most natural question is the
following:

what is the speed of convergence to zero of the probability that the
ergodic average differs from its limit by more than a prescribed value?

Formally, we want to know the speed of convergence to zero of

\[ \mu\Big\{ x \in \Omega : \Big| \frac{1}{n} S_n f(x) - \int f\,d\mu \Big| > t \Big\} \]

for $t > 0$ small enough and for a large class of continuous observables $f$. (By
Birkhoff's ergodic theorem, all we know is that this probability goes to $0$ as
$n$ goes to infinity.) In probabilistic terminology, we want to know the speed of
convergence in probability of ergodic averages to their limit. By analogy with bounded
i.i.d. processes, this speed should be exponential for 'sufficiently chaotic' systems.
We shall see that it can be only polynomial when mixing is not strong enough.

Another natural issue is to determine the order of typical values of $S_n f - n \int f\,d\mu$.
By analogy with a square-integrable i.i.d. process, one can expect this order to be
$\sqrt{n}$, and, more precisely, that a central limit theorem may hold. We shall see that
this is indeed the case for 'nice observables' and sufficiently chaotic systems. When
chaos is 'too weak', the central limit theorem may fail and the asymptotic
distribution may be non-Gaussian.
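
The $\sqrt{n}$ scaling can be probed numerically. The following sketch is an illustration under assumptions, not a result from the text: we again use $T(x) = 1 - 2x^2$ with its invariant density $1/(\pi\sqrt{1 - x^2})$ and the observable $f(x) = x^2$, for which $\int f\,d\mu = 1/2$. Writing $x = \cos\theta$, one checks that the variables $f \circ T^k$ are pairwise uncorrelated with variance $1/8$, so the variance of $(S_n f - n/2)/\sqrt{n}$ should be close to $1/8$. The names `normalized_sum`, `n` and `trials` are ours.

```python
import math
import random


def normalized_sum(x0, n):
    """Return (S_n f - n/2) / sqrt(n) for f(x) = x^2 and T(x) = 1 - 2x^2."""
    x, s = x0, 0.0
    for _ in range(n):
        s += x * x
        x = 1.0 - 2.0 * x * x
    return (s - 0.5 * n) / math.sqrt(n)


random.seed(0)
n, trials = 400, 2000
# cos(pi * U), with U uniform on [0, 1], is distributed according to the invariant density.
samples = [normalized_sum(math.cos(math.pi * random.random()), n) for _ in range(trials)]

mean = sum(samples) / trials
variance = sum((s - mean) ** 2 for s in samples) / (trials - 1)
print(mean, variance)
```

A histogram of `samples` can then be compared with a centered Gaussian density of variance $1/8$.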

The previous issues are formulated in terms of limit theorems and concern
ergodic sums. From the point of view of applications, an important problem is to
estimate the probability of deviation of a general observable $K(x, Tx, \ldots, T^{n-1}x)$
from its expected value. Formally, we ask if it is possible to find a positive function
$b(n,t)$ such that

\[ \mu\Big\{ x \in \Omega : \Big| K(x, Tx, \ldots, T^{n-1}x) - \int K(y, Ty, \ldots, T^{n-1}y)\,d\mu(y) \Big| > t \Big\} \le b(n,t) \]

for any $t > 0$ and for any $n \in \mathbb{N}$, with $b(n,t)$ depending on $K$. When $b(n,t)$ decreases
'rapidly' with $t$ and $n$, this means that $K(x, Tx, \ldots, T^{n-1}x)$ is 'concentrated' around
its expected value. It turns out that when the dynamical system is 'chaotic enough',
this concentration phenomenon is very sharp.

To be able to answer questions of this kind, we shall need to make hypotheses
on the dynamical systems as well as on the class of observables. Usually,
Hölder continuous functions are suitable.
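
To make the role of $b(n,t)$ concrete, here is a short worked computation for the ergodic average $K_0$. It is only a sketch under an assumption that is not established in this section: we suppose that a Gaussian-type bound holds, with a hypothetical constant $C > 0$ depending only on the system, in terms of the Lipschitz constants $\mathrm{Lip}_j(K)$ of $K$ in each of its variables.

```latex
% Assumed (hypothetical) form of the bound:
%   b(n,t) = 2 \exp\Big( - \frac{t^2}{2C \sum_{j=0}^{n-1} \mathrm{Lip}_j(K)^2} \Big).
% For the ergodic average of a Lipschitz observable f,
\[
  K_0(x_0,\ldots,x_{n-1}) = \frac{1}{n}\sum_{j=0}^{n-1} f(x_j),
  \qquad
  \mathrm{Lip}_j(K_0) = \frac{\mathrm{Lip}(f)}{n},
  \qquad
  \sum_{j=0}^{n-1} \mathrm{Lip}_j(K_0)^2 = \frac{\mathrm{Lip}(f)^2}{n},
\]
% so that, under the assumed bound,
\[
  b(n,t) = 2 \exp\Big( - \frac{n\,t^2}{2C\,\mathrm{Lip}(f)^2} \Big).
\]
```

Under this assumption, the probability of a deviation of size $t$ of the ergodic average decays exponentially in $n$, in line with the analogy with bounded i.i.d. processes mentioned above.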
We quickly and roughly describe the class of dynamical systems for which one can
prove various probabilistic results. These systems are used to model deterministic 
chaos which is caused by dynamic instability, or sensitive dependence on initial 
conditions, together with the fact that orbits are confined in a compact region. 



3.1 Hyperbolic dynamical systems 

The basic model for sensitive dependence on initial conditions is that of a uniformly
expanding map $T$ on a compact Riemannian manifold $\Omega$: $T$ is smooth and there are
constants $C > 0$ and $\lambda > 1$ such that for any $x \in \Omega$ and $v$ in the tangent space at $x$
and for any $n \in \mathbb{N}$

\[ \| DT^n(x) v \| \ge C \lambda^n \| v \|. \]

The prototypical example is $T(x) = 2x \pmod 1$ on $\Omega = S^1$ (the unit circle), which is
usually identified with the interval $[0, 1)$. The Lebesgue measure is invariant in this
case.
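
Both claims about the doubling map can be checked by hand; the short computation below is included only as a worked example.

```latex
% Uniform expansion: the derivative of T(x) = 2x (mod 1) is 2 everywhere, hence
\[
  |DT^n(x)\,v| = 2^n\,|v| \qquad (C = 1,\ \lambda = 2).
\]
% Invariance of Lebesgue measure: for 0 \le a < b \le 1,
\[
  T^{-1}\big([a,b)\big) = \Big[\tfrac{a}{2}, \tfrac{b}{2}\Big) \cup \Big[\tfrac{a+1}{2}, \tfrac{b+1}{2}\Big),
  \qquad
  \mathrm{Leb}\big(T^{-1}[a,b)\big) = \tfrac{b-a}{2} + \tfrac{b-a}{2} = \mathrm{Leb}\big([a,b)\big).
\]
```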

Uniformly hyperbolic maps have the property that at each point $x$ the tangent
space is a direct sum of two subspaces $E^u_x$ and $E^s_x$, one of which is expanded
($\| DT^n(x) v \| \ge C \lambda^n \| v \|$ for $v \in E^u_x$) and the other contracted
($\| DT^n(x) v \| \le C \lambda^{-n} \| v \|$ for $v \in E^s_x$). The prototypical example is Arnold's cat map
$(x, y) \mapsto (2x + y, x + y) \pmod 1$ of the unit torus.
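
For the cat map the splitting is explicit, since the map is induced by the matrix with rows $(2, 1)$ and $(1, 1)$: the expanding and contracting directions are its eigendirections. The small sketch below (variable and function names are ours) computes the two eigenvalues from the trace and determinant and checks that one is larger than one, the other smaller than one, and that their product is one, reflecting area preservation.

```python
import math

# The cat map (x, y) -> (2x + y, x + y) mod 1 is induced by the matrix [[2, 1], [1, 1]].
trace, det = 3.0, 1.0
disc = math.sqrt(trace * trace - 4.0 * det)
lam_u = (trace + disc) / 2.0  # expanding eigenvalue, (3 + sqrt(5)) / 2
lam_s = (trace - disc) / 2.0  # contracting eigenvalue, (3 - sqrt(5)) / 2


def cat_map(x, y):
    """One step of Arnold's cat map on the unit torus."""
    return (2.0 * x + y) % 1.0, (x + y) % 1.0


print(lam_u, lam_s, lam_u * lam_s)
print(cat_map(0.25, 0.5))
```

Here one can take $\lambda$ equal to `lam_u` in the definition above, since `lam_s` is its inverse.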

Non-uniform hyperbolicity refers to the fact that $C = C(x) > 0$ and $\lambda = \lambda(x) > 1$
almost everywhere: in words, the constants depend on $x$ and they have nice
properties only on a set of full measure. For instance, the presence of a single point where
$\lambda(x) = 1$ already causes important difficulties (the fundamental example being an
interval map with an indifferent fixed point at 0). Another instance of loss of
uniform hyperbolicity is when there is a point where the differential of $T$ vanishes
(e.g., the quadratic map or the Hénon map). A third typical situation is when the
differential has discontinuities. This is the case for the Lozi map and billiards, for
instance.



3.2 Attractors 

We are especially interested in dissipative systems with an attractor, that is,
volume-contracting maps $T$ with an attractor $\Lambda$. By an attractor we refer to a compact
invariant set with the property that all points in a neighborhood $U$ of $\Lambda$ (called its basin)
are attracted to $\Lambda$ (i.e. for any $x \in U$, $T^n x \to \Lambda$ as $n \to \infty$).

The prototype of a hyperbolic attractor is an Axiom A attractor. It is given by a smooth
map $T$ with an attractor $\Lambda$ on which $T$ is uniformly hyperbolic. These systems can
be viewed as subshifts of finite type by using a Markov partition: one can assign to
each point a bi-infinite symbol sequence describing its itinerary. This sequence can
be thought of as a configuration in a one-dimensional statistical mechanical system.
Special measures, called SRB measures (see next section), can be constructed by
pulling back adequate Gibbs measures which are invariant by the shift map; see [9]
and [3, Chap. 4].

Hénon's attractor is a genuinely non-uniformly hyperbolic attractor which
resisted mathematical analysis until the 1990s.



3.3 Sinai-Ruelle-Bowen measures 

We shall not define precisely Sinai-Ruelle-Bowen (SRB for short) measures but
content ourselves with saying that they are the invariant measures most compatible with
volume (Lebesgue measure) when volume is not preserved. Technically speaking,
they have absolutely continuous conditional measures along unstable manifolds and
a positive Lyapunov exponent. They provide a mechanism for explaining how local
instability on attractors can produce coherent statistics for orbits starting from large
sets in the basin. In particular, an SRB measure $\mu$ is 'observable' in the following
sense: there exists a subset $V$ of the basin of attraction with positive Lebesgue
measure such that for any continuous observable $f$ on $\Omega$ and any initial state $x \in V$ we
have

\[ \lim_{n\to\infty} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) = \int f\,d\mu, \]

or, more compactly,

\[ \frac{1}{n} \sum_{j=0}^{n-1} \delta_{T^j x} \xrightarrow{\ \text{vaguely}\ } \mu. \]

The point of this property is that the set of 'good states' has positive Lebesgue
measure although the measure $\mu$ is concentrated on the attractor, which has zero
Lebesgue measure. (Notice that this property does not follow from Birkhoff's
ergodic theorem.)

For one-dimensional maps, absolutely continuous invariant measures (with re- 
spect to Lebesgue measure) are examples of SRB measures. 

Roughly speaking, the approach to non-uniformly hyperbolic systems of L.-S. 
Young, which will be sketched below, can be considered as 'phenomenological' in 
the sense that it aims at modeling concrete dynamical behaviors observed in various 
examples. An 'axiomatic approach' can be followed which seeks to relax the condi- 
tions that define Axiom A systems in the hope of systematically enlarging the set of 
maps with SRB measures. For an account on this second approach, we refer to [40, 
Chap. 2]. For a nice and non-technical survey on SRB measures, we recommend 
reading ED. 
3.4 Dynamical systems modeled by a Young tower

In the 1970s, many examples were numerically observed whose dynamics are
dominated by expansions and contractions but which do not meet the stringent
requirements of Axiom A systems. The most famous example is probably the Hénon map,
which displays a 'strange attractor' for certain parameters. Such examples remained
mathematically intractable until the 1990s.

L.-S. Young developed a general scheme to study the probabilistic properties
of a class of 'predominantly hyperbolic' dynamical systems, including the Hénon
attractor and other famous examples. Very roughly, the picture is as follows. The
general setup is that $T : \Omega \to \Omega$ is a non-uniformly hyperbolic system in the sense
of Young [56, 57] with a return time function $R$ that decays either exponentially
[56] or polynomially [57]. In particular, $T : \Omega \to \Omega$ is modeled by a Young tower
constructed over a 'uniformly hyperbolic' base $Y \subset \Omega$. The degree of non-uniformity
is measured by the return time function $R : Y \to \mathbb{Z}^+$ to the base.

More precisely, by a classical construction in ergodic theory, one can construct
from $(Y, T^R)$ an extension $(\Delta, F)$, called a Young tower in the present setting. In
particular, there exists a continuous map $\pi : \Delta \to \Omega$ such that $\pi \circ F = T \circ \pi$. In
general $\pi$ need not be one-to-one or onto. One can visualize a tower by writing that
$\Delta = \bigcup_{\ell=0}^{\infty} \Delta_\ell$, where $\Delta_\ell$ can be identified with the set $\{x \in Y : R(x) > \ell\}$, that is, the
$\ell$-th floor of the tower. In particular, $\Delta_0$ is identified with $Y$. The dynamics in the
tower is as follows: each point $x \in \Delta_0$ moves up the tower until it reaches the top
level above $x$, after which it returns to $\Delta_0$, see Fig. 1. Moreover, $F$ has a countable
Markov partition $\{\Delta_{\ell,j}\}$ with the property that $\pi$ maps each $\Delta_{\ell,j}$ injectively onto
$Y$, which has a hyperbolic product structure. Each of the local unstable manifolds
defining the product structure of $\pi(\Delta_0)$ meets $\pi(\Delta_0)$ in a set of positive Lebesgue
measure. Further analytic and regularity conditions are imposed. We shall not give
further details and refer the reader to [56, 57] and [19].

Fig. 1 Schematic representation of the tower map $F : \Delta \to \Delta$.



Systems modeled by Young towers are more flexible than Axiom A systems in
that they are permitted to be non-uniformly hyperbolic: roughly speaking, think of
uniform hyperbolicity as required only for the return map to the base. Reasonable
singularities and discontinuities are also allowed: they do not appear in $Y$. As we
shall see, a number of probabilistic properties of $T : \Omega \to \Omega$ are actually captured by
the tail properties of $R$. The basic result proved in [56, 57] is the following, where
$m^u$ denotes Lebesgue measure on unstable manifolds.

Theorem 2. Let $T : \Omega \to \Omega$ be a dynamical system modeled by a Young tower.
If $\int R\,dm^u < \infty$, then $T$ has an ergodic SRB measure. If $\gcd\{R_i\} = 1$, there is
a unique SRB measure denoted by $\mu$.

Of course, since $R$ takes positive integer values, $\int R\,dm^u = \sum_{n \ge 0} m^u\{R > n\}$. In the
sequel, we shall implicitly assume that $\gcd\{R_i\} = 1$, without loss of generality.



3.5 Some examples 

The best known example of a non-uniformly expanding map of the interval is the
so-called Manneville-Pomeau map modelling intermittency. It is expanding except at 0,
where the slope of the map is one (neutral fixed point). For the sake of definiteness²,
consider the map

\[ T(x) = \begin{cases} x + 2^{\alpha} x^{1+\alpha} & \text{if } x \in [0, 1/2), \\ 2x - 1 & \text{if } x \in [1/2, 1], \end{cases} \qquad (1) \]

where $\alpha \in (0, 1)$ is a parameter. It is well known that there is a unique absolutely
continuous invariant measure $d\mu(x) = h(x)\,dx$ and $h(x) \sim x^{-\alpha}$ as $x \to 0$. There is a
Young tower with base $Y = [1/2, 1]$ and $\mathrm{Leb}\{y \in Y : R(y) > n\} = O(n^{-1/\alpha})$.
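
The restriction on the parameter can be read off from the tail of the return time. The following elementary computation, added here as a worked example, shows why a tail of order $n^{-1/\alpha}$ is integrable for every parameter in $(0, 1)$.

```latex
% The return time R is integrable as soon as its tails are summable:
\[
  \int_Y R \, d\mathrm{Leb} \;=\; \sum_{n \ge 0} \mathrm{Leb}\{ y \in Y : R(y) > n \}
  \;\le\; \mathrm{Leb}(Y) + \mathrm{const} \sum_{n \ge 1} n^{-1/\alpha},
\]
% and the series on the right converges if and only if 1/\alpha > 1:
\[
  \sum_{n \ge 1} n^{-1/\alpha} < \infty
  \quad \Longleftrightarrow \quad
  \alpha < 1 .
\]
```

The smaller the parameter, the thinner the tail of the return time and the less time typical orbits spend near the neutral fixed point.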

Another fundamental one-dimensional example is given by the quadratic family
$T_a : [-1, 1] \to [-1, 1]$ with $T_a(x) = 1 - ax^2$, where $a \in [1, 2]$, and for which $0$ is a critical
point (the slope vanishes). For a set of parameters of positive Lebesgue measure,
this map preserves a unique absolutely continuous probability measure. Its density
has an inverse square-root singularity. In this example, one can construct a tower
map with a return-time function which has an exponentially decreasing tail.

An important example of a dynamical system in the plane modeled by a Young
tower with a return time decaying exponentially is the Lozi map:

\[ (x, y) \mapsto (1 - a|x| + y,\ bx), \]

which possesses an attractor depicted in Fig. 2. Lozi's map is much simpler to
analyse than the famous Hénon map:

\[ (x, y) \mapsto (1 - ax^2 + y,\ bx). \]

2 The explicit formula (1) is not important; what matters is only the local behavior around the fixed
point.
Fig. 2 Simulation of the Lozi attractor for a = 1.7 and b = 0.5 (horizontal axis from -1.28 to 1.34, vertical axis from -0.64 to 0.67).

For certain parameters, this map has an attractor displayed in Fig. 3. For the so-called
Benedicks-Carleson parameters³, it is possible to prove [7] that the Hénon attractor
fits the general scheme of Young towers with exponential tails. In particular, there
is a unique SRB measure whose support is the attractor.

Fig. 3 Simulation of the Hénon attractor for a = 1.4 and b = 0.3. Notice that the
existing results do not cover these 'historical' values.
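
A picture such as Fig. 2 can be produced with a few lines of code. The sketch below is an illustration under assumptions: it takes the Lozi map in the usual form $(x, y) \mapsto (1 - a|x| + y, bx)$ with the parameters of Fig. 2, starts from the origin (which we assume to lie in the basin of the attractor), and only reports the bounding box of the computed orbit. The function name `lozi_orbit` is ours.

```python
def lozi_orbit(x, y, n, a=1.7, b=0.5, burn_in=1000):
    """Return n iterates of (x, y) -> (1 - a*|x| + y, b*x), after discarding transients."""
    for _ in range(burn_in):
        x, y = 1.0 - a * abs(x) + y, b * x
    points = []
    for _ in range(n):
        x, y = 1.0 - a * abs(x) + y, b * x
        points.append((x, y))
    return points


points = lozi_orbit(0.0, 0.0, 10000)

# Bounding box of the computed orbit, to be compared with the axis ranges of the figure.
x_min = min(p[0] for p in points)
x_max = max(p[0] for p in points)
y_min = min(p[1] for p in points)
y_max = max(p[1] for p in points)
print(x_min, x_max, y_min, y_max)
```

Replacing `abs(x)` by `x * x` and using a = 1.4, b = 0.3 gives the corresponding experiment for the Hénon map.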

Important examples of conservative maps are billiard maps, such as planar
Lorentz gases and Sinai's billiard. They can also be modeled by Young towers.
We refer to [56] but also to [21] for a conceptual account avoiding technicalities.



4 Limit theorems 

In this section we review some limit theorems obtained for the class of systems 
previously described. 



4.1 Covariance and decay of correlations 

Definition 1 (Correlations).

For a dynamical system $(\Omega, T, \mu)$ and an observable $f : \Omega \to \mathbb{R}$ in $L^2(\mu)$, the
autocovariance of order $\ell \ge 0$ of the process $\{f \circ T^k; k \ge 0\}$ is defined as

3 These parameters form a subset of R 2 with positive Lebesgue measure 0. 
\[ C_f(\ell) := \int f \cdot f \circ T^{\ell}\,d\mu - \Big( \int f\,d\mu \Big)^2. \]

More generally, for a pair $f, g$ of observables in $L^2(\mu)$, the covariance of order $\ell$
of the processes $\{f \circ T^k; k \ge 0\}$ and $\{g \circ T^k; k \ge 0\}$ is defined as

\[ C_{f,g}(\ell) := \int f \cdot g \circ T^{\ell}\,d\mu - \int f\,d\mu \int g\,d\mu. \]

In dynamical systems, it is customary to call the autocovariance of order $\ell$ the
"correlation coefficient" of order $\ell$.

The auto-covariance, or more generally, the covariance, is the basic indicator of
chaotic behavior: for large values of $\ell$, the random variables $f$ and $f\circ T^{\ell}$ should
be nearly independent, i.e. the coefficient $C_f(\ell)$ should decay to 0 as $\ell$ grows. Two
factors affect the rapidity of this decay: the strength of chaos in the underlying
dynamical system $T:\Omega\to\Omega$ and the regularity of the observable $f$.

Recall that a dynamical system $(\Omega,\mathcal{B},T,\mu)$ is mixing if for any two measurable
sets $A,B\subset\Omega$ one has $\mu(A\cap T^{-n}B)\to\mu(A)\mu(B)$ as $n\to\infty$. It is easy to prove
that the system is mixing if and only if correlations decay, i.e., $C_{f,g}(\ell)\to 0$ as
$\ell\to\infty$ for every pair of $f,g\in L^2(\mu)$.

The speed or rate of the decay of correlations (also called the rate of mixing) is 
crucial in the statistical analysis of chaotic systems. 



Theorem 3 (Mixing and decay of correlations [56, 57, 55, 35]).
Let $T:\Omega\to\Omega$ be a dynamical system modeled by a Young tower and $\mu$ its SRB
measure. The system is mixing and the rate of decay of correlations for Hölder
continuous observables is directly related to the behavior of $m^u\{R>n\}$ as
$n\to\infty$.

• For example, if $m^u\{R>n\} = O(e^{-an})$ for some $a>0$, then $(T,\mu)$ has
exponential decay of correlations.

• If $m^u\{R>n\} = O(1/n^{\gamma})$ for some $\gamma>1$, then $(T,\mu)$ has polynomial decay
of correlations. More precisely, $C_f(\ell) = O(1/\ell^{\gamma-1})$.



For the Hénon map with Benedicks-Carleson parameters, correlations for Hölder
continuous observables decay exponentially fast. The intermittent map (1) has
polynomial decay of correlations: $C_f(\ell) = O(1/\ell^{1/\alpha-1})$. Two-dimensional examples with
an intermittent behavior come from billiards. Chernov and Zhang studied in [22, 23]
several classes of billiards for which the decay of correlations is $O((\log\ell)^c/\ell^{1/a-1})$
for some parameter $a$ taking values in $(0,1/2]$.



4.2 Central limit theorem

We start with a definition.

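Definition 2 (Central limit theorem).

Let $(\Omega,T,\mu)$ be a dynamical system and $f:\Omega\to\mathbb{R}$ an observable in $L^2(\mu)$. We
say that $f$ satisfies the central limit theorem (CLT for short) with respect to $(T,\mu)$
if there exists $\sigma_f\ge 0$ such that

$$\lim_{n\to\infty}\mu\Big\{x : \frac{S_nf(x)-n\int f\,d\mu}{\sqrt{n}}\le t\Big\}
 = \frac{1}{\sigma_f\sqrt{2\pi}}\int_{-\infty}^{t} e^{-u^2/(2\sigma_f^2)}\,du,\qquad \forall t\in\mathbb{R}. \tag{2}$$

In probabilistic notation, the previous convergence is written compactly as

$$\frac{S_nf-n\int f\,d\mu}{\sqrt{n}}\ \xrightarrow{\ \text{law}\ }\ \mathcal{N}_{0,\sigma_f^2},$$

where $\mathcal{N}_{0,\sigma_f^2}$ stands for the Gaussian law with mean 0 and variance $\sigma_f^2$.

When $\sigma_f = 0$ the right-hand side has to be understood as the Heaviside function.

In probabilistic terms, this definition asks for the convergence in law of the ergodic
average 'zoomed out' by the factor $\sqrt{n}$ to a random variable whose law is $\mathcal{N}_{0,\sigma_f^2}$.
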
By analogy with i.i.d. processes, one expects that $\sigma_f^2$ be the variance of the
process $\{f\circ T^n\}$. If it were an i.i.d. process, we would have

$$\sigma_f^2 = \mathrm{Var}\big(S_nf/\sqrt{n}\big) = \int f^2\,d\mu - \Big(\int f\,d\mu\Big)^2 = C_f(0),$$

where $\mathrm{Var}(X) = \mathbb{E}[(X-\mathbb{E}(X))^2]$ is the variance of $X$. But because of the
correlations between $f$ and $f\circ T^n$, this is not the case. A natural candidate for the
variance is

$$\sigma_f^2 = \lim_{n\to\infty}\frac{1}{n}\int\Big(S_nf - n\int f\,d\mu\Big)^2 d\mu,$$

provided the limit exists. Simple algebra, using the invariance of $\mu$ under $T$, gives

$$\frac{1}{n}\int\Big(S_nf - n\int f\,d\mu\Big)^2 d\mu = C_f(0) + 2\sum_{\ell=1}^{n-1}\frac{n-\ell}{n}\,C_f(\ell).$$

It is simple to prove that if

$$\sum_{j=1}^{\infty}|C_f(j)| < \infty$$

then

$$\lim_{n\to\infty}\sum_{\ell=1}^{n-1}\frac{n-\ell}{n}\,C_f(\ell) = \sum_{\ell=1}^{\infty}C_f(\ell),$$

whence

$$\sigma_f^2 = C_f(0) + 2\sum_{\ell=1}^{\infty}C_f(\ell). \tag{3}$$
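
Formula (3) can be checked numerically on a toy example. The sketch below is an illustration under stated assumptions, not a statement about Young towers: it uses the quadratic map T(x) = 1 - 2x^2 on [-1, 1], whose invariant density 1/(pi sqrt(1 - x^2)) is explicit, and the observable f(x) = x + T(x). For this pair a short computation with x = cos(theta) suggests C_f(0) = 1, C_f(1) = 1/2 and C_f(l) = 0 for l >= 2, so that (3) would give a variance equal to 2. All function names, the initial point and the sample sizes are hypothetical choices.

```python
def ulam_map(x):
    """Quadratic map T(x) = 1 - 2x^2 on [-1, 1] (illustrative choice)."""
    return 1.0 - 2.0 * x * x


def observable(x):
    """f(x) = x + T(x); it has zero mean under the invariant measure."""
    return x + ulam_map(x)


def orbit_values(f, T, x0, n):
    """Return [f(x0), f(T x0), ..., f(T^{n-1} x0)]."""
    values = []
    x = x0
    for _ in range(n):
        values.append(f(x))
        x = T(x)
    return values


def autocovariance(values, lag):
    """Empirical C_f(lag), computed along a single orbit."""
    mean = sum(values) / len(values)
    m = len(values) - lag
    return sum((values[j] - mean) * (values[j + lag] - mean) for j in range(m)) / m


def variance_from_correlations(values, max_lag):
    """Truncation of formula (3): C_f(0) + 2 * sum_{l=1}^{max_lag} C_f(l)."""
    return autocovariance(values, 0) + 2.0 * sum(
        autocovariance(values, lag) for lag in range(1, max_lag + 1)
    )


def variance_from_blocks(values, block):
    """Mean of (S_block f)^2 / block over consecutive orbit segments."""
    mean = sum(values) / len(values)
    sums = [
        sum(values[i:i + block]) - block * mean
        for i in range(0, len(values) - block + 1, block)
    ]
    return sum(s * s for s in sums) / (len(sums) * block)


if __name__ == "__main__":
    vals = orbit_values(observable, ulam_map, 0.3, 100_000)
    print("from correlations:", variance_from_correlations(vals, 5))
    print("from block sums:  ", variance_from_blocks(vals, 100))
```

The two estimators correspond to the two expressions for the variance given above: the truncated series (3) and the normalized second moment of the ergodic sums.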



We have the following theorem. 



Theorem 4 (Central limit theorem, [56, 57]).

Let $T:\Omega\to\Omega$ be a dynamical system modeled by a Young tower and $\mu$ its SRB
measure. Let $f:\Omega\to\mathbb{R}$ be a Hölder continuous observable. If $\int R^2\,dm^u<\infty$
(which implies $\sum_{\ell\ge 1}|C_f(\ell)|<\infty$), then $f$ satisfies the central limit theorem
with respect to $(T,\mu)$.

For the class of systems discussed in this paper, it is well-known that typically
$\sigma_f>0$. Indeed, $\sigma_f=0$ only for Hölder observables lying in a closed subspace of
infinite codimension.

For example, Hölder continuous observables satisfy the CLT for the Hénon map
with Benedicks-Carleson parameters. For the map (1), the CLT holds if $\alpha<1/2$.
We shall see what happens when $\alpha\ge 1/2$ later on.

There are examples of convergence to the Gaussian law but with a non-classical
renormalizing sequence $(\sqrt{n\log n})$ instead of $(\sqrt{n})$. This is the case for
Bunimovich's billiard (stadium), where correlations decay only as $1/n$ (where $n$ is the
number of collisions); see [2].

In essence, the central limit theorem tells us that typically (i.e. with very high
probability) the fluctuations of $S_nf/n$ around $\int f\,d\mu$ are of order $1/\sqrt{n}$.
But, in principle, $S_nf$ can take values as large as $n$, i.e. $S_nf/n-\int f\,d\mu$ can be of
order one, but with a small probability. Such fluctuations are naturally called 'large
deviations'. This is the subject of the next section.



4.3 Large deviations 

For a bounded i.i.d. process $\{X_n\}$, it is a classical result in probability, usually called
Cramér's theorem 03, that $\mathbb{P}\{|n^{-1}(X_0+\cdots+X_{n-1})-\mathbb{E}[X_0]|>\delta\}$ decays
exponentially with $n$. Moreover,

$$\lim_{\varepsilon\to 0}\lim_{n\to\infty}\frac{1}{n}\log\mathbb{P}\Big\{\frac{1}{n}(X_0+\cdots+X_{n-1})\in[a-\varepsilon,a+\varepsilon]\Big\} = -I(a).$$

Typically, the function $I$ (the so-called rate function) is strictly convex and vanishes
only at $\mathbb{E}[X_0]$ (hence it is non-negative). Since the process is bounded, its domain is a
finite interval. The rate function turns out to be the Legendre transform of the
cumulant generating function $\theta\mapsto\log\mathbb{E}[\exp(\theta X_0)]$.
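
As a concrete instance of the Legendre transform just mentioned, consider fair coin tosses with values +1 and -1, an example chosen here for illustration. The cumulant generating function is then log cosh(theta), and the rate function has the closed form I(a) = ((1+a)/2) log(1+a) + ((1-a)/2) log(1-a) for |a| < 1. The sketch below compares a brute-force Legendre transform on a grid with this closed form; the grid bounds and step are arbitrary choices.

```python
import math


def cumulant(theta):
    """log E[exp(theta * X)] for X = +1 or -1 with probability 1/2 each."""
    return math.log(math.cosh(theta))


def rate_numeric(a, theta_max=10.0, steps=20_000):
    """Legendre transform sup_theta {a * theta - cumulant(theta)} on a grid."""
    best = -math.inf
    for i in range(steps + 1):
        theta = -theta_max + 2.0 * theta_max * i / steps
        best = max(best, a * theta - cumulant(theta))
    return best


def rate_exact(a):
    """Closed form of the rate function, valid for |a| < 1."""
    return 0.5 * (1 + a) * math.log(1 + a) + 0.5 * (1 - a) * math.log(1 - a)


if __name__ == "__main__":
    for a in (0.0, 0.2, 0.5, 0.8):
        print(a, rate_numeric(a), rate_exact(a))
```

The rate function vanishes only at a = 0, the mean of a single toss, and is strictly convex, in line with the general description above.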

One expects this exponential decay for the probability of deviation in 'sufficiently
chaotic' dynamical systems and for a Hölder continuous observable $f$. For notational
convenience, assume that $\int f\,d\mu = 0$. The goal is to prove that there exists a
rate function $I_f:\mathbb{R}\to[0,+\infty]$ such that

$$\lim_{\varepsilon\to 0}\lim_{n\to\infty}\frac{1}{n}\log\mu\Big\{x\in\Omega : \frac{1}{n}S_nf(x)\in[a-\varepsilon,a+\varepsilon]\Big\} = -I_f(a).$$

In many situations, such a result is obtained by proving that the cumulant generating
function

$$\Psi_f(z) = \lim_{n\to\infty}\frac{1}{n}\log\int e^{zS_nf}\,d\mu$$

exists and is smooth enough for $z$ real in an interval containing the origin. Then
the rate function is the Legendre transform of $\Psi_f$. However, as we shall see, when
chaos is not strong enough, one may indeed get subexponential decay rates for large
deviations (and therefore there is no rate function).

For systems modeled by a Young tower with exponential tails, we have the following
result. It turns out that the logarithmic moment generating function $\Psi_f(z)$
can be studied for complex $z$.



Theorem 5 (Cumulant generating functions [54, 51]).

Let $T:\Omega\to\Omega$ be a dynamical system modeled by a Young tower and $\mu$ its SRB
measure. Assume that $m^u\{R>n\} = O(e^{-an})$ for some $a>0$. Let $f:\Omega\to\mathbb{R}$
be a Hölder continuous observable such that $\int f\,d\mu = 0$.

• Then there exist positive numbers $\eta=\eta(f)$ and $\xi=\xi(f)$ such that the
logarithmic moment generating function $\Psi_f$ exists and is analytic in the
strip

$$\{z\in\mathbb{C} : |\mathrm{Re}(z)|<\eta,\ |\mathrm{Im}(z)|<\xi\}.$$

• In particular, $\Psi_f'(0)=\int f\,d\mu$ and $\Psi_f''(0)=\sigma_f^2$, which is the variance (3) of
the process $\{f\circ T^n\}$. Moreover, $\Psi_f(z)$ is strictly convex for real $z$ provided
$\sigma_f^2>0$.



From this kind of result, one can deduce the following result by using the Gärtner-Ellis
theorem or the like (see ET] section 4.5] and ETJ pp. 102-103]). Notice that it
is enough for $\Psi_f$ to be differentiable to apply this theorem.



4 The rate function must vanish at 0 in view of Birkhoff's ergodic theorem.

Theorem 6 (Exponential large deviations [51, 54]).

Under the same assumptions as in the previous theorem, let $I_f$ be the Legendre
transform of $\Psi_f$, i.e. $I_f(t) = \sup_{z\in(-\eta,\eta)}\{tz-\Psi_f(z)\}$. Then for any interval
$[a,b]\subset[\Psi_f'(-\eta),\Psi_f'(\eta)]$,

$$\lim_{n\to\infty}\frac{1}{n}\log\mu\Big\{x\in\Omega : \frac{1}{n}S_nf(x)\in[a,b]\Big\} = -\inf_{t\in[a,b]}I_f(t).$$



Remark 2. Using a general theorem of Bryc [11], one can deduce the central limit
theorem from Theorem 5. We stress that analyticity of $\Psi_f$ is necessary. In general, if
$\Psi_f$ is only $C^{\infty}$ (ensuring that $\Psi_f''(0)=\sigma_f^2$), it is false that the central limit theorem
follows from exponential large deviations.

We now turn to systems modeled by a Young tower with sub-exponential tails.
In this case, there is no rate function and one gets sub-exponential large deviation
bounds.



Theorem 7 (Sub-exponential large deviations [50]).

Let $T:\Omega\to\Omega$ be a dynamical system modeled by a Young tower and $\mu$ its SRB
measure. Assume that $m^u\{R>n\} = O(1/n^{\gamma})$ for some $\gamma>1$. Let $f:\Omega\to\mathbb{R}$
be a Hölder continuous observable such that $\int f\,d\mu = 0$. Then, for any $\varepsilon>0$,

$$\mu\Big\{x\in\Omega : \Big|\frac{1}{n}S_nf(x)\Big|>\varepsilon\Big\}\le\frac{C_{f,\varepsilon}}{n^{\gamma-1}}$$

for any $n\in\mathbb{N}$.



Notice that according to Theorem 3 the decay is the same as that for correlations.
The dependence in $\varepsilon$ of the constant $C_{f,\varepsilon}$ is in $\varepsilon^{-2q}$ where $q>\max(1,\gamma-1)$.

Let us again use our favorite example, namely the Manneville-Pomeau map, to
illustrate the preceding result. In this case, one can also prove a lower bound for
the probability of large deviations. Indeed, for the map (1), the theorem applies with
$\gamma=1/\alpha$ where $\alpha\in(0,1)$. Recall that for $\alpha\in(0,1/2)$, the central limit theorem holds
(see Section 4.2), but it fails when $\alpha\in[1/2,1)$ (see Section 4.4 below).
Moreover, it is proved in [50] that there is a nonempty open set of Hölder observables
$f$ for which $n^{-1/\alpha+1}$ is a lower bound for large deviations for $n$ sufficiently
large. For these observables, we have for any $\varepsilon>0$

$$\lim_{n\to\infty}\frac{\log\mu\{x\in[0,1] : |\frac{1}{n}S_nf(x)|>\varepsilon\}}{\log n} = -\frac{1}{\alpha}+1. \tag{4}$$

The purpose of this section is to show what happens when the CLT fails but one still
has convergence in law, with a renormalizing sequence different from $(\sqrt{n})$.
For the reader's convenience, we recall the notion of domain of attraction for an
observable and a classical theorem about stable laws for i.i.d. processes.

A function $f$, defined on a probability space $(\Omega,\mathcal{B},m)$, is said to belong to a
domain of attraction if it fulfills one of the following three conditions:

I. It belongs to $L^2(\Omega)$.

II. One has $\int 1_{\{|f|>x\}}\,dm\sim x^{-2}\ell(x)$, for some function $\ell$ such that
$L(x) := 2\int_1^x\frac{\ell(u)}{u}\,du$ is of slow variation and unbounded.

III. There exists $p\in(1,2)$ such that

$$\int 1_{\{f>x\}}\,dm = (c_1+o(1))\,x^{-p}L(x)\quad\text{and}\quad\int 1_{\{f<-x\}}\,dm = (c_2+o(1))\,x^{-p}L(x),$$

where $c_1,c_2$ are nonnegative real numbers such that $c_1+c_2>0$, and $L$ is of slow
variation.

Note that the three conditions are mutually exclusive.

The above definition of domain of attraction is motivated by the following
well-known, classical result in Probability (see e.g. [33]):



Theorem 8 (Convergence to stable laws for i.i.d. processes).

Let $Z$ be a random variable belonging to a domain of attraction. Let $Z_0,Z_1,\ldots$
be a sequence of independent, identically distributed random variables with
the same law as $Z$. In all cases, we set $A_n = n\mathbb{E}[Z]$ and

1. if condition I holds, we set $B_n=\sqrt{n}$ and $W=\mathcal{N}_{0,\mathbb{E}[Z^2]-\mathbb{E}[Z]^2}$;

2. if condition II holds, we let $B_n$ be a renormalizing sequence with $nL(B_n)\sim B_n^2$,
and $W=\mathcal{N}_{0,1}$;

3. if condition III holds, we let $B_n$ be a renormalizing sequence such that
$nL(B_n)\sim B_n^p$. Define $c=(c_1+c_2)\Gamma(1-p)\cos\big(\frac{p\pi}{2}\big)$ and $\beta=\frac{c_1-c_2}{c_1+c_2}$.
Let $W=W_{p,c,\beta}$ be the law with characteristic function

$$\mathbb{E}[e^{itW}] = \exp\Big(-c|t|^p\Big(1-i\beta\,\mathrm{sgn}(t)\tan\Big(\frac{p\pi}{2}\Big)\Big)\Big),\qquad (1<p<2,\ c>0,\ |\beta|\le 1). \tag{5}$$

Then

$$\frac{\sum_{k=0}^{n-1}Z_k-A_n}{B_n}\ \xrightarrow{\ \text{law}\ }\ W.$$

The case $p=2$ corresponds to the Gaussian law. For $p<2$, the corresponding
distributions are said to have 'heavy tails' since $\mathbb{P}\{Z>x\}=(c_1+o(1))x^{-p}$ and
$\mathbb{P}\{Z<-x\}=(c_2+o(1))x^{-p}$. The conditions put on the distribution of $Z$ are almost
necessary and sufficient to get a convergence in law of that type; we only restricted
the range of $p$'s, which could also be taken in the interval $(0,1]$.

We illustrate the occurrence of non-Gaussian limit laws in the most important
example, that is, the Pomeau-Manneville map (1).



Theorem 9 (Convergence to stable laws for the Manneville-Pomeau map [35]).

Let $T_\alpha$ be the map (1) of the interval with $\alpha\in(0,1)$ and $\mu$ its unique absolutely
continuous, invariant, probability measure. Let $f:[0,1]\to\mathbb{R}$ be a
Hölder observable and assume that $\int f\,d\mu=0$.

• If $\alpha<1/2$ then the central limit theorem holds (this is a special instance
of Theorem 4).

• If $\alpha>1/2$ then:

  - if $f$ is Lipschitzian and $f(0)=0$, then the central limit theorem holds;

  - if $f(0)\neq 0$ then $n^{-\alpha}S_nf$ converges in law to the stable law $W_{1/\alpha,c,\mathrm{sgn}(f(0))}$
    whose characteristic function is given by (5).



When $\alpha=1/2$ and $f(0)\neq 0$, there is convergence to the Gaussian law but with
the unusual renormalizing sequence $(\sqrt{n\log n})$ (instead of $\sqrt{n}$). See [34] for more
details.



4.5 Convergence in law made almost sure 

The aim of this section is to show that whenever we can prove a limit theorem in the
classical sense for a dynamical system, we can prove a suitable almost-sure version
based on an empirical measure with log-average.

The prototype of such a theorem is the almost-sure central limit theorem: if $X_n$ is an
i.i.d. $L^2$ sequence with $\mathbb{E}[X_i]=0$ and $\mathbb{E}[X_i^2]=1$, then, almost surely,

$$\frac{1}{\log n}\sum_{k=1}^{n}\frac{1}{k}\,\delta_{\sum_{j=0}^{k-1}X_j/\sqrt{k}}\ \rightharpoonup\ \mathcal{N}_{0,1} \tag{6}$$

where $\rightharpoonup$ means weak convergence of probability measures on $\mathbb{R}$. Here and
henceforth, $\delta_x$ is the Dirac mass at $x$. This result should be compared to the
classical central limit theorem, which can be stated as follows:

$$\mathbb{E}\Big[1_{\{\sum_{j=0}^{n-1}X_j/\sqrt{n}\,\le\,t\}}\Big]\ \longrightarrow\ \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{t}e^{-u^2/2}\,du$$

for any $t\in\mathbb{R}$. To better compare these theorems, it is worth noticing that (6) implies
that almost surely

$$\frac{1}{\log n}\sum_{k=1}^{n}\frac{1}{k}\,1_{\{\sum_{j=0}^{k-1}X_j/\sqrt{k}\,\le\,t\}}\ \longrightarrow\ \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{t}e^{-u^2/2}\,du \tag{7}$$

for any $t\in\mathbb{R}$. So, instead of taking the expected value, we take a logarithmic average
and obtain an almost-sure convergence.
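
The logarithmic average in (7) is simple to compute along one realization. The sketch below does so for fair coin tosses with values +1 and -1, an i.i.d. sequence with mean 0 and variance 1 chosen here for illustration. The function names and the seed are hypothetical, and because the normalization is only log n, the agreement with the Gaussian distribution function is expected to be rough at any sample size reachable in practice.

```python
import math
import random


def log_average_cdf(steps, t):
    """(1 / log n) * sum_{k=1}^{n} (1/k) * 1{S_k / sqrt(k) <= t}."""
    n = len(steps)
    partial_sum = 0.0
    total = 0.0
    for k, x in enumerate(steps, start=1):
        partial_sum += x
        if partial_sum / math.sqrt(k) <= t:
            total += 1.0 / k
    return total / math.log(n)


def gaussian_cdf(t):
    """Distribution function of the standard Gaussian law."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


if __name__ == "__main__":
    rng = random.Random(0)
    tosses = [rng.choice((-1, 1)) for _ in range(100_000)]
    for t in (-1.0, 0.0, 1.0):
        print(t, log_average_cdf(tosses, t), gaussian_cdf(t))
```

Note that the weights 1/k sum to slightly more than log n, so the quantity computed for very large t is a little above 1 for finite n.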

In fact, whenever there is independence and a classical limit theorem, the
corresponding almost-sure limit theorem also holds (under minor technical conditions),
see [8] and references therein.

Let us state the following general definition:

Definition 3 (Almost sure limit theorem towards a random variable).

Let $S_n$ be a sequence of random variables on a probability space, and let $B_n$
be a renormalizing sequence.⁵ We say that $S_n/B_n$ satisfies an almost sure limit
theorem towards a law $W$ if, for almost all $\omega$,

$$\frac{1}{\log N}\sum_{k=1}^{N}\frac{1}{k}\,\delta_{S_k(\omega)/B_k}\ \xrightarrow{\ \text{law}\ }\ W.$$

We now turn to the dynamical system context. The almost-sure central limit
theorem, for instance, takes the form

$$\frac{1}{\log n}\sum_{k=1}^{n}\frac{1}{k}\,\delta_{S_kf(x)/\sqrt{k}}\ \rightharpoonup\ \mathcal{N}_{0,\sigma_f^2}\qquad\text{for }\mu\text{-almost every }x,$$

where, for notational simplicity, we assume that $\int f\,d\mu=0$.

In the paper [18], we proved that "whenever we can prove a limit theorem in
the classical sense for a dynamical system, we can prove a suitable almost-sure
version". More precisely, we investigated three methods that are used to prove limit
theorems in dynamical systems: spectral methods, martingale methods, and induction
arguments. We showed that whenever these methods apply, the corresponding
limit theorem admits a suitable almost-sure version.
For instance, one has the following result.



Theorem 10 (Convergence in law made almost sure [18]).

Let $T:\Omega\to\Omega$ be a dynamical system modeled by a Young tower and let $\mu$ be its
SRB measure. Let $f:\Omega\to\mathbb{R}$ be a Hölder continuous observable such that
$\int f\,d\mu=0$. Then, if

$$\mu\Big\{x\in\Omega : \frac{S_nf(x)}{B_n}\le t\Big\}\ \xrightarrow[n\to\infty]{}\ W((-\infty,t])$$

for every $t\in\mathbb{R}$ at which $W$ is continuous, for a certain law $W$ and for a
certain renormalizing sequence $(B_n)$, then

$$\frac{1}{\log n}\sum_{k=1}^{n}\frac{1}{k}\,\delta_{S_kf/B_k}\ \rightharpoonup\ W\qquad\mu\text{-almost surely}.$$

5 A renormalization function is a function $B:\mathbb{R}_+^*\to\mathbb{R}_+^*$ of the form $B(x)=x^dL(x)$ where $d>0$
and $L$ is a normalized slowly varying function. The corresponding renormalizing sequence is
$B_n := B(n)$.


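Let us illustrate this theorem with a few examples. For any dynamical system
modeled by a Young tower with $L^2$ tails, one has

$$\frac{1}{\log n}\sum_{k=1}^{n}\frac{1}{k}\,\delta_{S_kf/\sqrt{k}}\ \rightharpoonup\ \mathcal{N}_{0,\sigma_f^2}\qquad\mu\text{-almost surely}.$$

For the Manneville-Pomeau map this is true for $\alpha\in(0,1/2)$. When $\alpha>1/2$,
this is still the case provided that $f(0)=0$ and $f$ is Lipschitz. If $f(0)\neq 0$, then

$$\frac{1}{\log n}\sum_{k=1}^{n}\frac{1}{k}\,\delta_{S_kf/k^{\alpha}}\ \rightharpoonup\ W_{1/\alpha,c,\mathrm{sgn}(f(0))}$$

(see Theorem 9).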



5 Concentration inequalities and applications 



5.1 Introduction



We start with the simplest occurrence of the concentration of measure phenomenon
[49]. Consider an independent sequence of Bernoulli random variables $(\eta_i)_{0\le i\le n-1}$
(i.e. $\mathbb{P}(\eta_i=-1)=\mathbb{P}(\eta_i=1)=1/2$, whence $\mathbb{E}[\eta_i]=0$). Then one has the following
classical inequality (Chernoff's bound):

$$\mathbb{P}\Big\{\Big|\sum_{i=0}^{n-1}\eta_i\Big|\ge t\Big\}\le 2\exp\Big(-\frac{t^2}{2n}\Big),\qquad\forall t>0. \tag{8}$$

This exponential inequality reflects the most important theorem of probability,
imprecisely stated as follows: "In a long sequence of tossing a fair coin, it is likely that
heads will come up nearly half of the time." Indeed, if we let $B_n$ be the number of
1's in the sequence $(\eta_i)_{0\le i\le n-1}$, then $\sum_{i=0}^{n-1}\eta_i = 2B_n-n$, and so (8) is equivalent to

$$\mathbb{P}\Big\{\Big|\frac{B_n}{n}-\frac{1}{2}\Big|\ge t\Big\}\le 2\exp(-2nt^2),\qquad\forall t>0.$$

This is of course a much stronger statement than the Strong Law of Large Numbers.
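
For coin tossing the left-hand side of the bound can be computed exactly from binomial coefficients, which makes the inequality easy to inspect. The sketch below assumes that (8) reads P{|eta_0 + ... + eta_{n-1}| >= t} <= 2 exp(-t^2 / (2n)), the classical Hoeffding form; the function names are illustrative.

```python
import math


def exact_tail(n, t):
    """P(|eta_0 + ... + eta_{n-1}| >= t) for independent fair +/-1 signs."""
    # With k signs equal to +1, the sum equals 2k - n.
    count = sum(math.comb(n, k) for k in range(n + 1) if abs(2 * k - n) >= t)
    return count / 2 ** n


def chernoff_bound(n, t):
    """Right-hand side 2 * exp(-t^2 / (2n))."""
    return 2.0 * math.exp(-t * t / (2.0 * n))


if __name__ == "__main__":
    n = 100
    for t in (10, 20, 30, 40):
        print(t, exact_tail(n, t), chernoff_bound(n, t))
```

The exact probability is typically much smaller than the bound. What concentration inequalities offer is an explicit estimate that holds for every n and every t, not sharp constants.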

The perspective of concentration inequalities is to look at the random variable
$Z_n=\sum_{i=0}^{n-1}\eta_i$ as a function of the individual variables $\eta_i$. Inequality (8), when $Z_n$
is normalized by $n$ (since it can take values as large as $n$), can be phrased pretty
offensively by saying that

$Z_n/n$ is essentially constant (= 0).

The scope of concentration inequalities is to understand to what extent a general
function $K$ of $n$ random variables $X_0,\ldots,X_{n-1}$, and not just the sum of them,
concentrates around its expectation like a sum of Bernoulli random variables. Of course,
the smoothness of $K$ has to play a role, as well as the dependence between the $X_i$'s.

Stated informally as a principle, the concentration of measure phenomenon is the
following:

"A random variable that smoothly depends on the influence of many weakly
dependent random variables is, on the appropriate scale, very close to a constant."

This statement is of course quantified by statements like (8) or weaker ones, as we
shall see.

In the context of dynamical systems, there are many examples of random variables
$K(X_0,\ldots,X_{n-1})$ which appear naturally but are defined in an indirect or complicated
way. Concentration inequalities, when available, make it possible to obtain, in a
systematic way, a priori bounds on the fluctuations of $K(X_0,\ldots,X_{n-1})$ around its
expectation by using a simple piece of information on $K$, namely its Lipschitz constants.



5.2 Concentration inequalities: abstract definitions 

We formulate some abstract definitions. 

Let $\Omega$ be a metric space. A real-valued function $K$ on $\Omega^n$ is separately Lipschitz
if, for any $i$, there exists a constant $\mathrm{Lip}_i(K)$ such that

$$|K(x_0,\ldots,x_{i-1},x_i,x_{i+1},\ldots,x_{n-1})-K(x_0,\ldots,x_{i-1},x_i',x_{i+1},\ldots,x_{n-1})|\le\mathrm{Lip}_i(K)\,d(x_i,x_i')$$

for all points $x_0,\ldots,x_{n-1},x_i'$ in $\Omega$.

Consider a stationary process $\{X_0,X_1,\ldots\}$ taking values in $\Omega$.



Definition 4 (Exponential concentration inequality).
We say that the process $\{X_0,X_1,\ldots\}$ satisfies an exponential concentration
inequality if there exists a constant $C>0$ such that, for any separately Lipschitz
function $K(x_0,\ldots,x_{n-1})$, one has

$$\mathbb{E}\Big[e^{K(X_0,\ldots,X_{n-1})-\mathbb{E}[K(X_0,\ldots,X_{n-1})]}\Big]\le e^{C\sum_{\ell=0}^{n-1}\mathrm{Lip}_\ell(K)^2}. \tag{9}$$

In some cases, it is not reasonable to hope for such a strong inequality. This leads 
to the following definition. 

Definition 5 (Polynomial concentration inequality).

We say that the process $\{X_0,X_1,\ldots\}$ satisfies a polynomial concentration inequality
with moment $p\ge 2$ if there exists a constant $C>0$ such that, for any separately
Lipschitz function $K(x_0,\ldots,x_{n-1})$, one has

$$\mathbb{E}\big[|K(X_0,\ldots,X_{n-1})-\mathbb{E}[K(X_0,\ldots,X_{n-1})]|^p\big]\le C\Big(\sum_{\ell=0}^{n-1}\mathrm{Lip}_\ell(K)^2\Big)^{p/2}. \tag{10}$$

An important special case of (10) is for $p=2$, which gives an inequality for the
variance of $K(X_0,\ldots,X_{n-1})$:

$$\mathrm{Var}(K(X_0,\ldots,X_{n-1}))\le C\sum_{\ell=0}^{n-1}\mathrm{Lip}_\ell(K)^2. \tag{11}$$



After these definitions, a few comments are in order.

• The crucial point in (9) and (10) is that the constant $C$ depends neither on $K$
nor on $n$. It solely depends on the process.

• These inequalities are not asymptotic: they hold true for any $n$.

• Obviously (9) is a much stronger inequality than (10). For instance, one can get
(11) from (9) as follows: multiply $K$ by $\lambda\neq 0$, subtract 1 from both sides, divide
by $\lambda^2$; conclude by using Taylor expansion and by letting $\lambda$ go to 0.

• An important consequence of the previous inequalities is a control on the
deviation probabilities of $K(X_0,\ldots,X_{n-1})$ from its expectation:

If a stationary process $\{X_n\}$ satisfies the exponential concentration inequality (9)
then, for any $t>0$, one has

$$\mathbb{P}\{|K(X_0,\ldots,X_{n-1})-\mathbb{E}[K(X_0,\ldots,X_{n-1})]|>t\}\le 2\exp\Big(-\frac{t^2}{4C\sum_{\ell=0}^{n-1}\mathrm{Lip}_\ell(K)^2}\Big). \tag{12}$$

If the process satisfies the polynomial concentration inequality (10), one gets
that for any $t>0$

$$\mathbb{P}\{|K(X_0,\ldots,X_{n-1})-\mathbb{E}[K(X_0,\ldots,X_{n-1})]|>t\}\le\frac{C}{t^p}\Big(\sum_{\ell=0}^{n-1}\mathrm{Lip}_\ell(K)^2\Big)^{p/2}. \tag{13}$$




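To prove (12), we use Markov's inequality and (9) applied to $\lambda K$: for any $t,\lambda>0$,

$$\begin{aligned}
\mathbb{P}\{K(X_0,\ldots,X_{n-1})-\mathbb{E}[K(X_0,\ldots,X_{n-1})]>t\}
&=\mathbb{P}\big\{\exp\big(\lambda(K(X_0,\ldots,X_{n-1})-\mathbb{E}[K(X_0,\ldots,X_{n-1})])\big)>\exp(\lambda t)\big\}\\
&\le e^{-\lambda t}\,\mathbb{E}\Big[e^{\lambda(K(X_0,\ldots,X_{n-1})-\mathbb{E}[K(X_0,\ldots,X_{n-1})])}\Big]\\
&\le e^{-\lambda t}\,e^{C\lambda^2\sum_{\ell=0}^{n-1}\mathrm{Lip}_\ell(K)^2}.
\end{aligned}$$

This upper bound is minimized when $\lambda=t/\big(2C\sum_{\ell=0}^{n-1}\mathrm{Lip}_\ell(K)^2\big)$, whence

$$\mathbb{P}\{K(X_0,\ldots,X_{n-1})-\mathbb{E}[K(X_0,\ldots,X_{n-1})]>t\}\le\exp\Big(-\frac{t^2}{4C\sum_{\ell=0}^{n-1}\mathrm{Lip}_\ell(K)^2}\Big).$$

The previous procedure is usually called the 'Chernoff bounding trick'. Of
course, we can apply this inequality to $-K$ and deduce at once (12).

Inequality (13) follows immediately from Markov's inequality. □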
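
The two deviation bounds are explicit once the Lipschitz constants are known, so they can be evaluated directly. The sketch below assumes that (12) reads 2 exp(-t^2 / (4 C sum_i Lip_i(K)^2)) and that (13) reads C (sum_i Lip_i(K)^2)^(p/2) / t^p. The constant C depends on the process and is not known explicitly in general, so the value used in the demo is purely hypothetical, as are the function names.

```python
import math


def exponential_deviation_bound(t, lipschitz, C):
    """Assumed form of (12): 2 exp(-t^2 / (4 C sum_i Lip_i^2))."""
    s = sum(l * l for l in lipschitz)
    return 2.0 * math.exp(-t * t / (4.0 * C * s))


def polynomial_deviation_bound(t, lipschitz, C, p):
    """Assumed form of (13): C (sum_i Lip_i^2)^(p/2) / t^p."""
    s = sum(l * l for l in lipschitz)
    return C * s ** (p / 2.0) / t ** p


def chernoff_exponent(lam, t, lipschitz, C):
    """Exponent -lam * t + C * lam^2 * sum_i Lip_i^2 from the proof."""
    s = sum(l * l for l in lipschitz)
    return -lam * t + C * lam * lam * s


def optimal_lambda(t, lipschitz, C):
    """Minimizer lam = t / (2 C sum_i Lip_i^2) of the exponent above."""
    s = sum(l * l for l in lipschitz)
    return t / (2.0 * C * s)


if __name__ == "__main__":
    n = 1000
    lips = [1.0 / n] * n  # e.g. an average of n terms, each 1-Lipschitz
    C = 1.0               # hypothetical value of the process constant
    print(exponential_deviation_bound(0.1, lips, C))
    print(polynomial_deviation_bound(0.1, lips, C, p=4))
```

When all Lipschitz constants equal 1/n the sum of squares is 1/n, so the exponential bound decays like exp(-n t^2 / (4C)) while the polynomial bound decays like n^(-p/2).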



5.3 Concentration inequalities for dynamical systems 

We now present concentration inequalities in the setting of non-uniformly hyperbolic
dynamical systems. In a forthcoming paper with S. Gouëzel [19] we prove the
following theorems. Let us notice that we take separately Lipschitz observables for
the sake of simplicity. All results are valid in the Hölder case (see [19, Section 7.1]).



5.3.1 Main results 



Theorem 11 (Exponential concentration inequality [19]).

Let $(\Omega,T,\mu)$ be a dynamical system modeled by a Young tower with exponential
tails. Then it satisfies an exponential concentration inequality: there
exists a constant $C>0$ such that, for any $n\in\mathbb{N}$, for any separately Lipschitz
function $K(x_0,\ldots,x_{n-1})$,

$$\int e^{K(x,Tx,\ldots,T^{n-1}x)-\int K(y,Ty,\ldots,T^{n-1}y)\,d\mu(y)}\,d\mu(x)\le e^{C\sum_{\ell=0}^{n-1}\mathrm{Lip}_\ell(K)^2}. \tag{14}$$



As a consequence of the Chernoff bounding trick (see the previous section), we get,
for any $t>0$ and for any $n\in\mathbb{N}$,

$$\mu\Big\{x\in\Omega : K(x,Tx,\ldots,T^{n-1}x)-\int K(y,\ldots,T^{n-1}y)\,d\mu(y)>t\Big\}\le\exp\Big(-\frac{t^2}{4C\sum_{\ell=0}^{n-1}\mathrm{Lip}_\ell(K)^2}\Big). \tag{15}$$

The same bound holds for lower deviations by applying (15) to $-K$.

There are well-known dynamical systems $(X,T)$ which can be modeled by a
Young tower with exponential tails [56]. Examples of invertible dynamical systems
fitting this framework are for instance Axiom A attractors, Hénon's attractor for
Benedicks-Carleson parameters [7], piecewise hyperbolic maps like the Lozi
attractor, some billiards with convex scatterers, etc. A non-invertible example is the
quadratic family for Benedicks-Carleson parameters.



Theorem 12 (Polynomial concentration inequality [19]).

Let $(\Omega,T,\mu)$ be a dynamical system modeled by a Young tower. Assume that,
for some $q\ge 2$, $\int R^q\,dm^u<\infty$. Then it satisfies a polynomial concentration
inequality with moment $2q-2$, i.e., there exists a constant $C>0$ such that,
for any $n\in\mathbb{N}$, for any separately Lipschitz function $K(x_0,\ldots,x_{n-1})$,

$$\int\Big|K(x,Tx,\ldots,T^{n-1}x)-\int K(y,Ty,\ldots,T^{n-1}y)\,d\mu(y)\Big|^{2q-2}d\mu(x)\le C\Big(\sum_{\ell=0}^{n-1}\mathrm{Lip}_\ell(K)^2\Big)^{q-1}. \tag{16}$$



As a direct application of Markov's inequality, we get from (16) that, for any $t>0$
and for any $n\in\mathbb{N}$,

$$\mu\Big\{x\in\Omega : \Big|K(x,Tx,\ldots,T^{n-1}x)-\int K(y,\ldots,T^{n-1}y)\,d\mu(y)\Big|>t\Big\}\le\frac{C}{t^{2q-2}}\Big(\sum_{\ell=0}^{n-1}\mathrm{Lip}_\ell(K)^2\Big)^{q-1}.$$

For the Manneville-Pomeau map, we know that the exponential concentration
inequality cannot be true. Indeed, (4) is clearly an obstruction. Applying Theorem
12 we get a concentration inequality with moment $Q$ for any $Q<\frac{2}{\alpha}-2$ when
$\alpha\in(0,1/2)$. Applying (13) yields a deviation bound in $n^{-1/\alpha+1+\delta}$, for any $\delta>0$.
This is very close to the upper bound in $n^{-1/\alpha+1}$ guaranteed by Theorem 7. In fact,
one can get an optimal deviation inequality and get the latter bound, but we need the
notion of a weak polynomial concentration inequality that we do not want to detail
here, see [19].

The first paper in which a concentration inequality was proved for dynamical systems
is [25]: an exponential concentration inequality is established for piecewise
uniformly expanding maps of the interval. For dynamical systems $(X,T)$ modeled
by a Young tower with exponential tails, a polynomial concentration inequality
with moment 2 (variance) was proved in [16]. Regarding systems with subexponential
decay of correlations, the first result was obtained in [15] for the Manneville-Pomeau
map (1): a polynomial concentration inequality with moment 2 was proved
for $\alpha<4-\sqrt{15}$. The above theorems, proved in [19], improve all these results in
several ways.



5.4 A sample of applications of concentration inequalities 

We present some applications of concentration inequalities to show them in action.
Some more, as well as all proofs, can be found in [17, 18, 20, 25].



5.4.1 Warming-up with ergodic sums 

Let us apply the exponential inequality to the basic example $K_0(x_0,\ldots,x_{n-1}) =
f(x_0)+\cdots+f(x_{n-1})$ where $f$ is a Lipschitz observable. We obviously have
$\mathrm{Lip}_i(K_0)=\mathrm{Lip}(f)$ for any $i=0,\ldots,n-1$. When evaluated along an orbit segment
$x,\ldots,T^{n-1}x$, we of course get the ergodic sum $S_nf(x)$. Assuming that the exponential
concentration inequality (9) holds, (12) gives

$$\mu\Big\{x\in\Omega : \Big|\frac{S_nf(x)}{n}-\int f\,d\mu\Big|>t\Big\}\le 2\exp\Big(-\frac{nt^2}{4C\,\mathrm{Lip}(f)^2}\Big),\qquad\forall t>0.$$

Compared with large deviations (see Subsection 4.3), we observe that this is the
right order in $n$. The large deviation result provides a much more accurate description
of this deviation probability as $n\to\infty$. But the previous inequality shows how
small this deviation probability is already for finite $n$'s.
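
The arithmetic behind this bound is short. The following worked computation is a sketch that assumes the deviation inequality (12) in the form $\mathbb{P}\{|K-\mathbb{E}[K]|>u\}\le 2\exp(-u^2/(4C\sum_i\mathrm{Lip}_i(K)^2))$, applied with $u=nt$.

```latex
% Worked computation for K_0(x_0,...,x_{n-1}) = f(x_0) + ... + f(x_{n-1}).
\begin{align*}
\sum_{i=0}^{n-1} \mathrm{Lip}_i(K_0)^2 &= n\,\mathrm{Lip}(f)^2, \\
\mu\Big\{x : \Big|\frac{S_n f(x)}{n} - \int f\,d\mu\Big| > t\Big\}
  &= \mu\Big\{x : \Big|S_n f(x) - n\int f\,d\mu\Big| > nt\Big\} \\
  &\le 2\exp\Big(-\frac{(nt)^2}{4C\,n\,\mathrm{Lip}(f)^2}\Big)
   = 2\exp\Big(-\frac{n\,t^2}{4C\,\mathrm{Lip}(f)^2}\Big).
\end{align*}
```

For instance, with $n=10^4$ and $t=0.1$ the exponent equals $-25/(C\,\mathrm{Lip}(f)^2)$, so the quality of the estimate at finite $n$ hinges on the size of the constant $C$.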



5.4.2 Correlations 

Let $(\Omega,T,\mu)$ be an ergodic dynamical system and $f:\Omega\to\mathbb{R}$ be a Lipschitz
observable such that $\int f\,d\mu=0$. An obvious estimator of the correlation coefficient $C_f(k)$
(cf. Def. 1) is

$$\hat{C}_f(n,k,x) = \frac{1}{n}\sum_{j=0}^{n-1}f(T^jx)\,f(T^{j+k}x).$$

Indeed, an immediate consequence of Birkhoff's ergodic theorem is that

$$\hat{C}_f(n,k,x)\ \xrightarrow[n\to\infty]{}\ C_f(k),\qquad\mu\text{-a.s.}$$

Observe that $\int\hat{C}_f(n,k,x)\,d\mu = C_f(k)$ by the invariance of the measure.
We have the following result.



Theorem 13 (Correlation coefficients).

Let $T:\Omega\to\Omega$ be a dynamical system modeled by a Young tower and $\mu$ its SRB
measure. Let $f:\Omega\to\mathbb{R}$ be a Lipschitz observable such that $\int f\,d\mu=0$.

• If the tower has exponential tails, there exists $D>0$ such that for any $t>0$
and any $k,n\in\mathbb{N}$

$$\mu\big\{x\in\Omega : |\hat{C}_f(n,k,x)-C_f(k)|>t\big\}\le 2\exp\Big(-D\,\frac{n^2t^2}{n+k}\Big).$$

• If for some $q\ge 2$, $\int R^q\,dm^u<\infty$, then there exists $G>0$ such that for any
$t>0$ and any $k,n\in\mathbb{N}$

$$\mu\big\{x\in\Omega : |\hat{C}_f(n,k,x)-C_f(k)|>t\big\}\le G\Big(\frac{n+k}{n^2t^2}\Big)^{q-1}.$$



The proof is easy. One considers the function

$$K(x_0,\ldots,x_{n+k-1}) = \frac{1}{n}\sum_{j=0}^{n-1}f(x_j)\,f(x_{j+k})$$

of $n+k$ variables. Each variable appears in at most two terms of the sum, so
$\mathrm{Lip}_i(K)\le 2\|f\|_\infty\mathrm{Lip}(f)/n$. Applying (12) and (13) yields the desired inequalities.
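
The estimator itself takes only a few lines to implement. The sketch below assumes the form (1/n) sum_{j=0}^{n-1} f(T^j x) f(T^{j+k} x) and tries it on an illustrative system that is not discussed at this point of the text: the quadratic map T(x) = 1 - 2x^2 on [-1, 1] with f(x) = x, for which the invariant density 1/(pi sqrt(1 - x^2)) gives C_f(0) = 1/2 and, by a short computation with x = cos(theta), C_f(k) = 0 for k >= 1. The function names and the initial point are hypothetical.

```python
def ulam_map(x):
    """Quadratic map T(x) = 1 - 2x^2 on [-1, 1] (illustrative choice)."""
    return 1.0 - 2.0 * x * x


def identity(x):
    """Observable f(x) = x, which has zero mean for this example."""
    return x


def correlation_estimator(f, T, x, n, k):
    """(1/n) * sum_{j=0}^{n-1} f(T^j x) * f(T^{j+k} x)."""
    values = []
    for _ in range(n + k):  # the orbit segment x, Tx, ..., T^{n+k-1} x
        values.append(f(x))
        x = T(x)
    return sum(values[j] * values[j + k] for j in range(n)) / n


if __name__ == "__main__":
    for k in range(4):
        print(k, correlation_estimator(identity, ulam_map, 0.3, 50_000, k))
```

The estimator is a function of n + k successive points of the orbit, which is why the sum of squared Lipschitz constants, and hence the deviation bounds, involve n + k.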



5.4.3 Empirical measure 



Let $(\Omega,T,\mu)$ be an ergodic dynamical system. Birkhoff's ergodic theorem (see
Subsection 2.3) implies that the empirical measure $\mathcal{E}_n(x) = \frac{1}{n}\sum_{j=0}^{n-1}\delta_{T^jx}$ converges
vaguely to $\mu$. We want to obtain a 'speed' for this convergence, so we need to define
a distance. We use the Kantorovich distance $\mathrm{dist}_K$. For two probability measures $\mu_1$
and $\mu_2$ on $\Omega$, it is defined as

$$\mathrm{dist}_K(\mu_1,\mu_2) = \sup\Big\{\int g\,d\mu_1-\int g\,d\mu_2 : g:\Omega\to\mathbb{R}\text{ is 1-Lipschitz}\Big\}.$$

This distance is compatible with the vague topology.
We are led to consider the observable

$$K(x,Tx,\ldots,T^{n-1}x) = \mathrm{dist}_K(\mathcal{E}_n(x),\mu).$$

Theorem 14 (Empirical measure).

Let $T:\Omega\to\Omega$ be a dynamical system modeled by a Young tower with exponential
tails and $\mu$ its SRB measure. Then, for any $t>0$ and for any $n\in\mathbb{N}$,

$$\mu\Big\{x\in\Omega : \Big|\mathrm{dist}_K(\mathcal{E}_n(x),\mu)-\int\mathrm{dist}_K(\mathcal{E}_n(y),\mu)\,d\mu(y)\Big|>\frac{t}{\sqrt{n}}\Big\}\le 2e^{-t^2/(4C)}.$$



This theorem follows at once from (12) and the fact that the function $K$ defined
above has all its Lipschitz constants bounded by $1/n$. A natural step further is to
try to get an upper bound for $\int\mathrm{dist}_K(\mathcal{E}_n(\cdot),\mu)\,d\mu$. There is no good bound
in general; one has first to restrict to one-dimensional systems (because there is
a special representation for the Kantorovich distance in terms of the distribution
functions). Second, the regularity of the observables for which there is exponential
decay of correlations is crucial. We mention only one result for the quadratic map
$T_a(x) = 1-ax^2$ acting on $\Omega=[-1,1]$, where $a\in[0,2]$. For Benedicks-Carleson
parameters, we mentioned above that this system can be modeled by a Young tower
with exponential tails. In fact there is an exponential decay of correlations for more
general observables than the Hölder ones, namely for observables with bounded
variation [59]. This makes it possible to prove that

$$\int\mathrm{dist}_K(\mathcal{E}_n(\cdot),\mu)\,d\mu\le\frac{B}{\sqrt{n}}$$

for some $B>0$. Hence we deduce the following result from (14).



Theorem 15.

Consider the map $T_a(x) = 1-ax^2$ acting on $\Omega=[-1,1]$ for $a$ in the
Benedicks-Carleson set of parameters. Then there exist $D,t_0>0$ such that
for any $t>t_0$ and for any $n\in\mathbb{N}$

$$\mu\Big\{x\in\Omega : \mathrm{dist}_K(\mathcal{E}_n(x),\mu)>\frac{t}{\sqrt{n}}\Big\}\le 2e^{-Dt^2}.$$


A natural question is to estimate the density of the absolutely continuous invari- 
ant measure of a one-dimensional dynamical system. A classical estimator is the 
so-called kernel density estimator. We refer to ifTTH TDl for details and results.
5.4.4 Tracing orbits

We use concentration inequalities to quantify the tracing properties of some subsets
of orbits. The basic problem can be formulated as follows. Let $A$ be a set of initial
conditions and $x$ an initial condition not in $A$: how well can one approximate the
orbit of $x$ by an orbit from an initial condition of $A$? One can measure the 'average
quality of tracing' by defining

$$\mathcal{S}_A(x,n) = \frac{1}{n}\inf_{y\in A}\sum_{j=0}^{n-1}d(T^jx,T^jy)$$

where $d$ is the distance on $\Omega$. Assume that $\mathrm{diam}(\Omega)=1$. We have the following
result.



Theorem 16.

Let $T:\Omega\to\Omega$ be a dynamical system modeled by a Young tower with exponential
tails and $\mu$ its SRB measure. There exists a constant $c>0$ such that for
any subset $A\subset\Omega$ with strictly positive $\mu$-measure, for any $n\in\mathbb{N}$ and for any $t>0$,

H U € a : y A (*,n) > + ' 1 < e-' 2 ^ 



(where $C>0$ is the constant appearing in Theorem 11).



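Proof. The function of $n$ variables

\[
  K(x_0,\ldots,x_{n-1}) = \frac{1}{n}\,\inf_{y\in A}\,\sum_{j=0}^{n-1} d(x_j, T^j y)
\]

is separately Lipschitz and it is easy to check that $\mathrm{Lip}_i(K) \le 1/n$ for any
$i = 0,\ldots,n-1$. We use (12) to get at once

\[
  \mu\left\{x : \mathcal{S}_A(x,n) > \int \mathcal{S}_A(y,n)\,d\mu(y)
  + \frac{t}{\sqrt{n}}\right\} \le e^{-t^2/(4C)}. \tag{18}
\]

We now estimate $\int \mathcal{S}_A(y,n)\,d\mu(y)$ from above. Fix $s > 0$ and define the set

\[
  B_s = \left\{x : \mathcal{S}_A(x,n) > \int \mathcal{S}_A(y,n)\,d\mu(y)
  + \frac{s}{\sqrt{n}}\right\}.
\]

We have the identity

\[
  \int \mathcal{S}_A(y,n)\,d\mu(y) = \int_{A} \mathcal{S}_A(y,n)\,d\mu(y)
  + \int_{A^c\cap B_s^c} \mathcal{S}_A(y,n)\,d\mu(y)
  + \int_{B_s} \mathcal{S}_A(y,n)\,d\mu(y).
\]

The first integral is equal to 0 by the very definition of $\mathcal{S}_A$. The second one is
bounded by

\[
  \left(\int \mathcal{S}_A(y,n)\,d\mu(y) + \frac{s}{\sqrt{n}}\right)\mu(A^c).
\]

And the third one is bounded by $\mu(B_s)$ because $\mathcal{S}_A(y,n) \le 1$. By (18) one has

\[
  \mu(B_s) \le e^{-s^2/(4C)}.
\]

Hence

\[
  \int \mathcal{S}_A(y,n)\,d\mu(y) \le
  \left(\int \mathcal{S}_A(y,n)\,d\mu(y) + \frac{s}{\sqrt{n}}\right)\mu(A^c)
  + e^{-s^2/(4C)},
\]

i.e.

\[
  \int \mathcal{S}_A(y,n)\,d\mu(y) \le
  \mu(A)^{-1}\left(\frac{s}{\sqrt{n}} + e^{-s^2/(4C)}\right).
\]

To finish the proof, it remains to optimize over $s > 0$. □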
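
The optimization over $s$ is not carried out in the text. As a sketch, one admissible
(not necessarily optimal) choice is $s = \sqrt{2C\log n}$, which balances the two terms
up to a logarithmic factor; the choice of $s$ is ours, and $\mathcal{S}_A$ denotes the
tracing quantity defined at the start of this subsection:

```latex
\[
  s = \sqrt{2C\log n}
  \quad\Longrightarrow\quad
  e^{-s^{2}/(4C)} = e^{-\frac{1}{2}\log n} = \frac{1}{\sqrt{n}},
\]
\[
  \int \mathcal{S}_A(y,n)\,d\mu(y)
  \;\le\; \frac{1}{\mu(A)}\left(\frac{s}{\sqrt{n}} + e^{-s^{2}/(4C)}\right)
  \;=\; \frac{1+\sqrt{2C\log n}}{\mu(A)\,\sqrt{n}} .
\]
```

Combined with (18), this gives a deviation threshold of order
$\mu(A)^{-1}\sqrt{\log n / n} + t/\sqrt{n}$ for $n \ge 2$.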

For a system modeled by a Young tower with polynomial tails, one can obtain a
weaker bound, see [19].
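
To make the 'average quality of tracing' concrete, here is a minimal numerical sketch.
It rests on assumptions that are not in the text: the map (x -> 3x mod 1 on the circle),
the distance (rescaled so that the diameter is 1), and above all the replacement of the
infimum over the set A by a minimum over a finite list of candidate initial conditions
taken from A, which only gives an upper bound on the quantity defined above. All function
names are hypothetical.

```python
import math


def tripling_map(x):
    """Hypothetical example map on the circle [0, 1): x -> 3x mod 1."""
    return (3.0 * x) % 1.0


def circle_distance(a, b):
    """Distance on the circle [0, 1), rescaled so that the diameter equals 1."""
    gap = abs(a - b) % 1.0
    return 2.0 * min(gap, 1.0 - gap)


def orbit(T, x, n):
    """Return the orbit segment [x, T x, ..., T^{n-1} x]."""
    points = []
    for _ in range(n):
        points.append(x)
        x = T(x)
    return points


def tracing_quality(T, dist, x, candidates, n):
    """(1/n) * min over y in candidates of sum_{j<n} dist(T^j x, T^j y).

    The minimum over a finite list replaces the infimum over A (an assumption),
    so the returned value is an upper bound on the tracing quantity.
    """
    orbit_x = orbit(T, x, n)
    best = math.inf
    for y in candidates:
        orbit_y = orbit(T, y, n)
        total = sum(dist(p, q) for p, q in zip(orbit_x, orbit_y))
        best = min(best, total)
    return best / n
```

Enlarging the list of candidates can only decrease the returned value, which is one
cheap way to see how sensitive the estimate is to the sampling of A.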

5.4.5 Integrated periodogram 

Let $(\Omega, T, \mu)$ be an ergodic dynamical system and $f : \Omega \to \mathbb{R}$ be a
Lipschitz observable with $\int f\,d\mu = 0$. Define the empirical integrated periodogram
of the process $\{f\circ T^k\}$ by

\[
  J_n(x,\omega) = \int_0^{\omega} \frac{1}{n}
  \left|\sum_{j=0}^{n-1} e^{-ijs} f(T^j x)\right|^2 ds,
  \qquad \omega \in [0,2\pi].
\]

Let

\[
  J(\omega) = C_f(0)\,\omega + 2\sum_{k=1}^{\infty} \frac{\sin(\omega k)}{k}\,C_f(k),
\]

that is, the cosine Fourier transform of the sequence of correlation coefficients.
(Recall that $C_f(k) = \int f\cdot f\circ T^k\,d\mu$.) One can prove the following theorem.



Theorem 17.

Let $T : \Omega \to \Omega$ be a dynamical system modeled by a Young tower with
exponential tails and $\mu$ its SRB measure. Let $f : \Omega \to \mathbb{R}$ be a Lipschitz
function such that $\int f\,d\mu = 0$. There exist some positive constants $c_1, c_2$ such
that for any $n \in \mathbb{N}$ and for any $t > 0$

\[
  \mu\left\{x \in \Omega : \sup_{\omega\in[0,2\pi]} \left|J_n(x,\omega) - J(\omega)\right|
  > t + \frac{c_1 (1+\log n)^{3/2}}{\sqrt{n}}\right\}
  \le e^{-c_2 n t^2/(1+\log n)^2}.
\]

The proof can be found in [19].
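
For numerical purposes the integral defining the empirical integrated periodogram can be
evaluated in closed form: expanding the squared modulus and integrating term by term
gives the same shape as the limiting function, with the correlation coefficients replaced
by empirical ones. The sketch below, in which all names are hypothetical and `values[j]`
stands for f(T^j x), implements this closed form together with a slow midpoint-rule
quadrature that serves only as a sanity check.

```python
import math


def empirical_correlations(values):
    """C_n(k) = (1/n) * sum_{j=0}^{n-1-k} values[j] * values[j+k], for k = 0..n-1."""
    n = len(values)
    return [sum(values[j] * values[j + k] for j in range(n - k)) / n
            for k in range(n)]


def integrated_periodogram(values, omega):
    """Closed form of int_0^omega (1/n) |sum_j exp(-i j s) values[j]|^2 ds."""
    corr = empirical_correlations(values)
    total = corr[0] * omega
    for k in range(1, len(values)):
        total += 2.0 * corr[k] * math.sin(omega * k) / k
    return total


def integrated_periodogram_quadrature(values, omega, steps=20000):
    """Midpoint-rule evaluation of the same integral, used only for validation."""
    n = len(values)
    h = omega / steps
    acc = 0.0
    for m in range(steps):
        s = (m + 0.5) * h
        re = sum(v * math.cos(j * s) for j, v in enumerate(values))
        im = sum(v * math.sin(j * s) for j, v in enumerate(values))
        acc += (re * re + im * im) / n
    return acc * h
```

The closed form costs O(n^2) operations for the correlations; this is adequate for a
sketch, and a fast Fourier transform would be the natural replacement for long orbits.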



5.4.6 Almost-sure central limit theorem 



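We come back to the almost-sure central limit theorem (cf. Subsection 4.5). Let $f$
be a Lipschitz observable such that $\int f\,d\mu = 0$. For convenience, let

\[
  \mathcal{A}_n(x) = \frac{1}{D_n}\sum_{k=1}^{n} \frac{1}{k}\,
  \delta_{S_k f(x)/\sqrt{k}}, \qquad D_n = \sum_{k=1}^{n}\frac{1}{k}.
\]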

This is a random measure on $\mathbb{R}$. Given $x \in \Omega$, $\mathcal{A}_n(x)$ is a
measure. To measure its closeness to the Gaussian law $\mathcal{N}_{0,\sigma_f^2}$, we use
the Kantorovich distance $\mathrm{dist}_K$. For two probability measures $\mu_1$ and $\mu_2$
on $\mathbb{R}$, it is defined as

\[
  \mathrm{dist}_K(\mu_1,\mu_2) = \sup\left\{\int g\,d\mu_1 - \int g\,d\mu_2 :
  g : \mathbb{R} \to \mathbb{R} \text{ is 1-Lipschitz}\right\}.
\]

Convergence in this distance entails both weak convergence and convergence of the
first moment.



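Theorem 18 (Almost-sure central limit theorem).

Let $T : \Omega \to \Omega$ be a dynamical system modeled by a Young tower such that

\[
  \int R^2\,dm^u < \infty
\]

and $\mu$ its SRB measure. Let $f : \Omega \to \mathbb{R}$ be a Lipschitz observable with
$\int f\,d\mu = 0$. Assume that $\sigma_f^2 > 0$. Then

\[
  \mathrm{dist}_K\left(\mathcal{A}_n(x), \mathcal{N}_{0,\sigma_f^2}\right) \to 0
  \quad \text{for $\mu$-a.e. } x \in \Omega.
\]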



This is slightly stronger than the usual almost-sure central limit theorem. In fact, 
a more general statement is true: if a process {X^} satisfies the central limit theorem 
and ( p"Tj ), then the previous theorem is true. This is the way it is proved in ifTTIl . 
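
On the real line the Kantorovich distance between two probability measures with finite
first moment equals the integral of the absolute difference of their distribution
functions. The following sketch uses this identity for finitely supported measures, such
as weighted empirical measures; the function name and the input convention (two lists of
atoms with their weights) are assumptions made for illustration.

```python
def kantorovich_distance(atoms_1, weights_1, atoms_2, weights_2):
    """Kantorovich distance between two finitely supported probability measures on R.

    Uses the one-dimensional identity dist_K = integral of |F_1 - F_2|,
    where F_1 and F_2 are the cumulative distribution functions.
    Each weight list is assumed to be nonnegative and to sum to 1.
    """
    events = [(a, w) for a, w in zip(atoms_1, weights_1)]
    events += [(a, -w) for a, w in zip(atoms_2, weights_2)]
    events.sort(key=lambda pair: pair[0])
    distance = 0.0
    cdf_gap = 0.0  # value of F_1 - F_2 on the interval ending at the current atom
    previous = events[0][0]
    for position, signed_weight in events:
        distance += abs(cdf_gap) * (position - previous)
        cdf_gap += signed_weight
        previous = position
    return distance
```

Comparing a discrete measure with a Gaussian law requires integrating against the
Gaussian distribution function instead, which this sketch does not attempt.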
6 Open questions

In this section, we list various open questions. The list we present is by no means 
exhaustive. 



6.1 Random dynamical systems 

In order to model the effect of noise on a discrete-time dynamical system, it is 
natural to introduce models obtained by compositions of different maps rather than 
by repeated applications of exactly the same transformation. The idea is to study 
sequences of maps 'picked at random' in some stationary fashion. We refer to [40, 
Chap. 5] for a survey. 

The simplest case is the following. We assume that the phase space is contained in
$\mathbb{R}^d$ and that there is a sequence of i.i.d., $\mathbb{R}^d$-valued random
variables $\xi_1, \xi_2, \ldots$ such that, instead of observing the orbit of the initial
condition $x$, one observes sequences $\{x_n\}$ of points in the state space given by

\[
  x_{n+1} = T(x_n) + \varepsilon\,\xi_n,
\]

where $\varepsilon$ is a fixed parameter (the amplitude of the noise if $|\xi_n|$ is of
order one). The process $\{x_n\}$ is called a stochastic perturbation of the dynamical
system $T$. By construction, it is a one-parameter family of Markov chains. If we assume
that $\xi_n$ has a density $\rho$ with respect to Lebesgue measure, the transition
probability of the chain is given by

\[
  p(x_{n+1} \mid x_n) = \frac{1}{\varepsilon^{d}}\,
  \rho\!\left(\frac{x_{n+1} - T(x_n)}{\varepsilon}\right).
\]

One expects that in the limit $\varepsilon \to 0$ (the zero-noise limit), the right-hand
side converges to $\delta(x_{n+1} - T(x_n))$ and that, if $\mu_\varepsilon$ is an invariant
measure for the chain, then its accumulation points (in vague topology) should be
invariant measures for $T$. There are reasons to believe that under fairly general
conditions, SRB measures may be natural candidates for zero-noise limits, hence they
should be stochastically stable. This is indeed proved for Axiom A systems and certain
non-uniformly hyperbolic systems, see e.g. [26] and [6] for the Hénon map.

A natural question is to prove concentration inequalities for random dynamical
systems, in particular for the additive noise model. This would lead, for instance,
to quantitative information on the distance between the empirical measure of the
process $\{x_n\}$ and the SRB measure $\mu$ as a function of $n$ and $\varepsilon$.

The above setting concerns 'dynamical noise'. Another relevant situation is
'observational noise': one observes the process $y_n = x_n + \varepsilon\xi_n$ and the goal
is merely to extract $\{x_n\}$, and possibly to try to reconstruct $T$ [48].
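
The additive noise model is easy to simulate. The sketch below is only an illustration
under assumptions that are not in the text: the state space is the circle [0, 1), the
map is x -> 3x mod 1, the noise is standard Gaussian, and the perturbed point is wrapped
back onto the circle. The function names are hypothetical.

```python
import random


def tripling_map(x):
    """Hypothetical example map on the circle [0, 1): x -> 3x mod 1."""
    return (3.0 * x) % 1.0


def perturbed_orbit(T, x0, epsilon, n, rng):
    """Return [x_0, ..., x_n] with x_{k+1} = T(x_k) + epsilon * xi_k, taken mod 1.

    The xi_k are i.i.d. standard Gaussian variables (an assumption), and the
    reduction mod 1 keeps the chain inside the assumed state space.
    """
    xs = [x0]
    for _ in range(n):
        xi = rng.gauss(0.0, 1.0)
        xs.append((T(xs[-1]) + epsilon * xi) % 1.0)
    return xs
```

For instance, `perturbed_orbit(tripling_map, 0.1, 0.01, 1000, random.Random(0))` produces
one realization of the chain; setting `epsilon` to 0 recovers the deterministic orbit.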
6.2 Coupled map lattices

Coupled map lattices are a class of (discrete-time) spatially extended dynamical 
systems which were introduced in the 1980's by physicists. We refer to the lecture 
notes [13] for more details and background. 

The basic set-up is a state space $\Omega = I^{\mathbb{Z}^d}$ where $I \subset \mathbb{R}$
is a compact interval, typically $[0, 1]$. There is a 'local' dynamics $\tau : I \to I$
which defines an 'unperturbed' dynamics $T_0$ on $\Omega$ by $(T_0(x))_i = \tau(x_i)$,
$i \in \mathbb{Z}^d$. Then one defines a perturbed dynamics by introducing couplings
$\Phi_\varepsilon : \Omega \to \Omega$ of the form $\Phi_\varepsilon(x) = x + A_\varepsilon(x)$.
The basic (and most studied) example is the 'diffusive' nearest neighbor coupling

\[
  (\Phi_\varepsilon(x))_i = x_i + \frac{\varepsilon}{2d}\sum_{|i-j|=1} (x_j - x_i),
  \qquad i \in \mathbb{Z}^d.
\]

Of course, $\varepsilon$ measures the strength of the coupling.
The dynamics we are interested in is

\[
  T_\varepsilon := \Phi_\varepsilon \circ T_0.
\]

The study of such dynamical systems offers many challenges and a lot of questions
remain open [13].
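
A finite-size caricature of this construction can be iterated directly. The sketch below
assumes a one-dimensional lattice with periodic boundary conditions, a coupling
coefficient equal to epsilon/2 for each of the two nearest neighbors, and a particular
continuous piecewise linear local map of slope 3 in absolute value; all three choices,
and the function names, are illustrative assumptions rather than part of the text.

```python
def local_map(x):
    """Hypothetical local map on [0, 1]: continuous, piecewise linear, |slope| = 3."""
    if x <= 1.0 / 3.0:
        return 3.0 * x
    if x <= 2.0 / 3.0:
        return 2.0 - 3.0 * x
    return 3.0 * x - 2.0


def diffusive_coupling(state, epsilon):
    """Nearest-neighbor diffusive coupling on a periodic one-dimensional lattice."""
    size = len(state)
    return [
        state[i] + 0.5 * epsilon * ((state[(i - 1) % size] - state[i])
                                    + (state[(i + 1) % size] - state[i]))
        for i in range(size)
    ]


def cml_step(state, epsilon):
    """One iteration of the coupled dynamics: local map first, then the coupling."""
    return diffusive_coupling([local_map(x) for x in state], epsilon)
```

For epsilon between 0 and 1 each new site value is a convex combination of values in
[0, 1], so the state stays in the cube; the coupling also preserves the spatial mean.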

From the point of view of probabilistic properties, the following is known, see 
O and references therein. The local map $\tau$ on the unit interval $I$ is assumed to be
continuous and piecewise $C^2$. The expansion rate is assumed to be bigger than 2:
$|\tau'| > 2$, and both the first- and second-order derivatives are bounded. The couplings
are assumed to be diffusive and of finite range (the above example corresponds to a
range equal to one). Under these conditions, the coupled map lattice $T_\varepsilon$ has a
unique observable measure $\mu_\varepsilon$ in the sense that, for
$m^{\otimes\mathbb{Z}}$-almost every state $x \in \Omega$,

\[
  \frac{1}{n}\sum_{k=0}^{n-1} \delta_{T_\varepsilon^k x}
  \xrightarrow{\ \text{vaguely}\ } \mu_\varepsilon.
\]

This measure is exponentially mixing both in time and space. Moreover, any Lipschitz
function on $I^{\mathbb{Z}}$ depending on a finite number of coordinates satisfies the
central limit theorem with respect to $(T_\varepsilon, \mu_\varepsilon)$. The authors also
prove a local limit theorem. All these results hold provided that $\varepsilon$ is small
enough. As the authors point out, their tools also allow one to prove exponential large
deviations.

A natural question is to prove concentration inequalities in this context. One 
expects an exponential concentration inequality to hold. 
6.3 Partially hyperbolic systems

As mentioned above, the theory of hyperbolic dynamical systems initially developed
from the notion of uniform hyperbolicity. This notion can be weakened in essentially 
two ways. One of these is to retain hyperbolicity without uniformity, which leads 
to the theory of non-uniformly hyperbolic dynamical systems. The class of systems 
modeled by Young towers described in this chapter is an important subclass of such 
systems. 

The other generalization is to retain uniformity without hyperbolicity by allowing a 
center direction in which any expansion or contraction is in a uniform way slower 
than the expansion and contraction in the unstable and stable subspaces. Such sys- 
tems are called partially hyperbolic. Among the basic examples are time-one maps 
of Anosov flows (the center direction is the flow direction), quasi-hyperbolic toral 
automorphisms and mostly contracting diffeomorphisms. We refer to [40, Chap. 1]
for a survey.

In [29], the author proves many probabilistic results such as the central limit
theorem (and its refinements like the almost-sure invariance principle) and exponential
large deviations.

It would be nice to establish concentration inequalities for partially hyperbolic sys- 
tems. 



6.4 Nonconventional ergodic averages 

Nonconventional or multiple ergodic averages are typically of the form

\[
  \frac{1}{n}\sum_{k=1}^{n} f_1(T^{k}x)\,f_2(T^{2k}x)\cdots f_\ell(T^{\ell k}x).
\]

That is, one considers the averages of products of, say, bounded measurable functions
along an arithmetic progression of length $\ell$ for an arbitrary integer $\ell \ge 1$.
The case $\ell = 1$ is of course the standard case. Such averages originated in the
ergodic theoretic proof by Furstenberg of Szemerédi's theorem on arithmetic
progressions based on the so-called multiple recurrence theorem [32]. For a dynamical
system $(X, T, \mu)$ which is weakly mixing, the above averages converge in $L^2$ to

\[
  \prod_{i=1}^{\ell} \int f_i\,d\mu.
\]
The next questions are about fluctuations of nonconventional averages when the
$f_i$'s are, say, Lipschitz functions: central limit theorem, large deviations and
concentration properties. Regarding the central limit theorem, a first step was done by
Kifer [45] for uniformly hyperbolic systems (for averages along more general
progressions). Large deviations seem much more difficult to analyse and turn out to be
nontrivial even for i.i.d. processes (see [12]).

A transfer operator approach remains to be introduced to tackle such problems
because the usual machinery does not seem appropriate. Remarkably, concentration
inequalities, if available for the system at hand, apply straightforwardly and provide
nontrivial information while they 'ignore' the fine structure of these averages. We
leave as an exercise to the reader the derivation of such concentration bounds.
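
As a purely numerical illustration, a nonconventional average along one orbit can be
computed as follows. The sketch assumes that the sum runs over k = 1, ..., n and takes
the observables as a list of Python callables; the function name and these conventions
are hypothetical.

```python
def nonconventional_average(T, x, observables, n):
    """(1/n) * sum_{k=1}^{n} prod_{i=1}^{l} f_i(T^{i k} x), with l = len(observables)."""
    ell = len(observables)
    # Precompute the orbit up to time ell * n, the largest index that is needed.
    orbit = [x]
    for _ in range(ell * n):
        orbit.append(T(orbit[-1]))
    total = 0.0
    for k in range(1, n + 1):
        term = 1.0
        for i, f in enumerate(observables, start=1):
            term *= f(orbit[i * k])
        total += term
    return total / n
```

With a single observable the function reduces to an ordinary Birkhoff average, which
gives a simple consistency check. Note that the coordinate of index m of the orbit
enters at most `len(observables)` of the n products, which is the structural fact one
needs when estimating the separate Lipschitz constants of this function of the orbit.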



6.5 Erdős-Rényi law for nonuniformly hyperbolic systems and
applications to multifractal analysis



We come back to large deviations (see Subsection 4.3). When a rate function does
exist for a dynamical system (see Theorem 6), the following question is natural:
given an observable $f$, is it possible to extract the rate function $I_f$ solely from a
typical orbit of the system?

With a different motivation, this question was answered by Erdős and Rényi [30] in
the context of i.i.d. random variables. In the context of dynamical systems, the first
result was obtained in [14] for a class of piecewise, uniformly expanding maps of
the interval. For this class, Theorem 6 is valid and one can in fact get refined large
deviation estimates necessary to obtain the following result. Given an observable $f$
and $t$ in the domain of $I_f$, let

\[
  M_k(x) = \max\left\{S_k f(T^j x) : 0 \le j \le
  \lfloor \exp(k I_f(t)) \rfloor - k\right\}.
\]

In words, we are looking for the largest ergodic sum of $f$ in a window of width $k$
inside the orbit of $x$ up to time $\lfloor \exp(k I_f(t)) \rfloor - k$.
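
The maximal window sum can be computed in one pass with a sliding window. In the sketch
below, `values[j]` stands for f(T^j x) and `rate` for the value of the rate function at t;
both names are hypothetical, and the rate is treated as a known input only to illustrate
the definition (in the estimation problem it is precisely the unknown).

```python
import math


def max_window_sum(values, k, rate):
    """M_k = max{ S_k f(T^j x) : 0 <= j <= floor(exp(k * rate)) - k }.

    `values[j]` stands for f(T^j x); `rate` stands for the rate function at t.
    """
    horizon = math.floor(math.exp(k * rate))
    last_start = horizon - k
    if last_start < 0 or last_start + k > len(values):
        raise ValueError("orbit segment too short or horizon smaller than k")
    window = sum(values[:k])  # ergodic sum over the first window, j = 0
    best = window
    for j in range(1, last_start + 1):
        # Slide the window by one step: add the entering term, drop the leaving one.
        window += values[j + k - 1] - values[j - 1]
        best = max(best, window)
    return best
```

Since the horizon grows exponentially in k, the length of the orbit segment that is
required becomes large very quickly, which is the practical limitation of this approach.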
Theorem 19 (Erdős-Rényi law for uniformly expanding maps of the interval [14]).

Let $T : [0, 1] \to [0, 1]$ be a piecewise $C^2$, uniformly expanding map which is
topologically mixing and $\mu$ its unique absolutely continuous invariant measure.
Let $f : [0, 1] \to \mathbb{R}$ be an observable of bounded variation. Then, there exists
$t^* > 0$ such that, for any $|t| < t^*$ and for Lebesgue-almost every $x \in [0,1]$

\[
  \lim_{k\to\infty} \frac{M_k(x)}{k} = t.
\]

More precisely, one has almost everywhere

\[
  \limsup_{k\to\infty} \frac{M_k(x) - kt}{\log k} \le \frac{1}{2u}
\]

and

\[
  \liminf_{k\to\infty} \frac{M_k(x) - kt}{\log k} \ge -\frac{1}{2u}
\]

where $u = I_f'(t)$.



Notice that this theorem gives an optimal rate of convergence, the same as in the 
i.i.d. case obtained by Deheuvels et al. (see 03]). 

In view of Theorem 6 and the technique used in [14], one expects Theorem 19 to be
true for systems modeled by a Young tower with exponential tails. This was partially
shown in [28], but only in the one-dimensional case, and with a non-optimal rate.

On the side of applications, Theorem 6 allows one to construct an estimator for $I_f$. This
is particularly relevant to the estimation of multifractal spectra, see 0]. 



7 Notes on further results 



We quickly describe or barely mention other results that we could not develop in the 
main text. 



7.1 More on the central limit theorem 



It is natural to ask for a speed of convergence in the central limit theorem. This type 
of result is called a Berry-Esseen theorem. 
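For systems modeled by a Young tower with exponential tails, one has the following.
Let $f : \Omega \to \mathbb{R}$ be a Hölder continuous observable. Assume that
$\sigma_f^2 > 0$. Then there exists a constant $c = c(f) > 0$ such that

\[
  \sup_{t\in\mathbb{R}} \left| \mu\left\{x :
  \frac{S_n f(x) - n\int f\,d\mu}{\sigma_f\sqrt{n}} \le t\right\}
  - \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{t} e^{-u^2/2}\,du \right|
  \le \frac{c}{\sqrt{n}}.
\]
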

The speed of convergence can be slower. Let us again illustrate this by looking at
the map $T_\alpha$ defined earlier. For $0 < \alpha < 1/2$ and $f$ Hölder continuous (and
not of the form $g - g \circ T_\alpha$), we know that the central limit theorem holds
(see end of Section 4.2).

• If $0 < \alpha < 1/3$ then one gets a speed of order $O(1/\sqrt{n})$ as above.

• If $1/3 < \alpha < 1/2$ and $f(0) \neq 0$, the speed is $O(1/n^{\frac{1}{2\alpha}-1})$.

We refer the interested reader to [36] for more details and proofs, where a 'local
limit theorem' is also proved.



7.2 Moderate deviations 

One can also characterize the fluctuations of $S_n f$ which are of an order intermediate
between $\sqrt{n}$ (central limit theorem) and $n$ (large deviations). Such fluctuations,
when suitably scaled, satisfy large deviations type estimates with a quadratic rate
function determined by $\sigma_f^2$. We have the following theorem:



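Theorem 20 (Moderate deviations [54]).

Let $T : \Omega \to \Omega$ be a dynamical system modeled by a Young tower and $\mu$ its
SRB measure. Assume that $m^u\{R > n\} = O(e^{-an})$ for some $a > 0$. Let
$f : \Omega \to \mathbb{R}$ be a Hölder continuous observable which is not of the form
$g - g\circ T$ (whence $\sigma_f^2 > 0$). Let $a_n$ be an increasing sequence of positive
real numbers such that $\lim_{n\to\infty} a_n/\sqrt{n} = \infty$ and
$\lim_{n\to\infty} a_n/n = 0$. Then for any interval $[a,b] \subset \mathbb{R}$ we have

\[
  \lim_{n\to\infty} \frac{n}{a_n^2}\,
  \log \mu\left\{x : \frac{S_n f(x) - n\int f\,d\mu}{a_n} \in [a,b]\right\}
  = -\inf_{t\in[a,b]} \frac{t^2}{2\sigma_f^2}.
\]

For the case of systems modeled by Young towers with polynomial tails, see [50].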
The almost sure invariance principle is a very strong reinforcement of the central
limit theorem: it ensures that the trajectories of a process can be matched with the 
trajectories of a Brownian motion in such a way that almost surely the error between 
the trajectories is negligible compared to the size of the trajectory. 

For $\lambda \in (0, 1/2]$ and $\Sigma^2$ a (possibly degenerate) symmetric
semi-positive-definite $d \times d$ matrix, we say that an $\mathbb{R}^d$-valued process
$(A_0, A_1, \ldots)$ satisfies an almost sure invariance principle with error exponent
$\lambda$ and limiting covariance $\Sigma^2$ if there exist a probability space $\Omega^*$
and two processes $(A_0^*, A_1^*, \ldots)$ and $(B_0, B_1, \ldots)$ on $\Omega^*$ such
that:

1. the processes $(A_0, A_1, \ldots)$ and $(A_0^*, A_1^*, \ldots)$ have the same
   distribution;

2. the random variables $(B_0, B_1, \ldots)$ are independent and distributed as
   $\mathcal{N}_{0,\Sigma^2}$;

3. and almost surely in $\Omega^*$

\[
  \left|\sum_{\ell=0}^{n-1} A_\ell^* - \sum_{\ell=0}^{n-1} B_\ell\right| = o(n^{\lambda}).
\]

A Brownian motion at integer times coincides with a sum of i.i.d. Gaussian variables,
hence this definition can also be formulated as an almost sure approximation
by a Brownian motion, with error $o(n^{\lambda})$.

In the dynamical system context, take $A_\ell = f \circ T^\ell$ where
$f : \Omega \to \mathbb{R}^d$ is regular. It is proved in [52] by martingale methods and
then in [37] with purely spectral methods, that dynamical systems modeled by Young
towers satisfy the almost-sure invariance principle. Namely, this is the case if
$\int R^q\,dm^u < \infty$ for $q > 2$ and for observables $f : \Omega \to \mathbb{R}^d$
which are Hölder continuous. The relevance of considering $\mathbb{R}^d$-valued
observables is that, for instance, the position variable of the planar periodic Lorentz
gas with finite horizon approximates a two-dimensional Brownian motion.

The almost-sure invariance principle implies in particular the central limit theorem,
the functional central limit theorem, and the law of iterated logarithm, among
others, see e.g. [38, 53]. It also implies the almost-sure central limit theorem [47].

Acknowledgements The author thanks Sébastien Gouëzel for useful comments. He also thanks
César Maldonado and Mike Todd for a careful reading.